Device Photolithography


Foreword 


The fabrication of semiconductor and thin-film integrated circuits 
requires the delineation of precisely defined patterns in various ma- 
terials in order to obtain the required functional performance of the 
device. Photolithographic processing has primarily been used for this 
purpose, requiring that masks be generated as the basic “tool” for 
producing integrated circuits. This issue is devoted to a detailed 
description of a new mask-making system intended to satisfy the Bell 
System’s requirements for increasing numbers of increasingly complex 
masks. The system features high precision and large throughput made 
possible by a specially designed family of machines linked together by 
a computer-controlled information system. 

The heart of the system is the primary pattern generator (PPG) 
which produces the original artwork by scanning a TV-like raster
pattern on a photographic plate with a focused laser beam. The hori- 
zontal deflection of the beam is provided by reflecting it off a spinning 
polygonal mirror while the vertical motion of the plate is provided by 
a precision stepping table. The laser beam is modulated by an acousto- 
optic element under the control of a digital data stream which con- 
tains the topographic information. The machine is capable of generat- 
ing a 22-cm by 18-cm pattern with an address structure of 32,000 by 
26,000 units in about 10 minutes. It provides a reproducibility of one
part in 25,000 and an absolute accuracy greater than one part in
10,000. The reduction cameras and step-and-repeat camera that com- 
plete the system were designed to fully exploit the high-speed and 
accuracy capabilities of the PPG. Looking ahead to future device 
applications in which the higher resolution offered by an electron beam 
generator could be of importance, development work on such a unit 
is also described. 

The articles in this issue discuss: (i) the overall system, including
the engineering considerations that led to the choice of pattern
generation; (ii) the computer programs required to transform topographic
information into a digital data stream suitable for control of either the
PPG or the electron beam pattern generator; (iii) the PPG, including
the optical, mechanical and electrical design features; (iv) the electron
beam pattern generator; (v) electron-sensitive materials for use with
the electron beam machine; (vi) the design and characteristics of the
lenses used in the mask-making system; (vii) the optical and
mechanical design of the reduction cameras; (viii) the optical and
mechanical design of the computer-controlled step-and-repeat camera;
(ix) the thin photosensitive materials required for use in the above
cameras; (x) the specially designed coordinate-measuring machine
used to inspect masks and to maintain the mask-making system; and
(xi) the information system which controls the flow of work through
the mask laboratory.

Many people, too numerous to mention, throughout Bell Labora- 
tories and Western Electric Company, have made significant contri- 
butions to the development of this mask-making system. Their efforts 
have led to the successful installation and operation of two mask 
laboratories, one at Murray Hill, New Jersey, and one at Allentown, 
Pennsylvania. 

FRANKLIN H. BLECHER 


Device Photolithography: 


An Overview of the New Mask-Making 
System 


By F. L. HOWLAND and K. M. POOLE 
(Manuscript received July 9, 1970) 


This paper reviews how photolithographic masks for silicon and thin-film
integrated circuits are made. Increasing production and complexity of
masks make heavy demands on the operating time, reproducibility, and
accuracy of the new mask-making system. The pattern generation step, in
which the design is converted to a photographic image, is critical to the
system. Advantages and disadvantages of other pattern-producing methods
are discussed. The technique of producing patterns by optically scanning
lines with a rotating mirror while mechanically stepping the photographic
plate is described. This article develops the basic design parameters of
address structure and operational speed for the primary pattern generator,
and it defines the requirements for reduction cameras and the
step-and-repeat camera for a system capable of meeting the needs for both
thin-film and silicon integrated circuits. The article notes the system
limitations imposed by optical generation of patterns and lens tolerances.


I. INTRODUCTION 


The Electronic Materials and Components Development Area of 
Bell Telephone Laboratories has made the development of hybrid- 
integrated electronics, combining semiconductor and thin-film tech- 
nologies, its major general field of activity for several years. Silicon 
integrated circuits provide the active elements for both digital and 
analog systems, and passive components can be incorporated if tol- 
erances are not too tight. Thin-film circuits based on tantalum can 
provide stable resistors and capacitors which can be trimmed to precise 
values, while other thin-metal films can be used advantageously for 
conductors. Thus silicon and thin-film technologies together provide a 
sufficient set of elementary components for most systems functions. 
Equally important, the choice of silicon circuits made in the
beam-leaded, sealed-junction form and thin-film elements on ceramic or
similar substrates gives us complementary technologies which are
physically compatible.

Both parts of this hybrid-device technology have come to depend 
primarily on photolithographic methods for delineating the areas in 
which material will be added, removed, or modified as the original sub- 
strate is successively transformed into the final circuit. Both parts of 
this technology have grown in volume of activity and in sophistication 
of technique. In doing so they have put increasing demands on mask- 
making laboratories for more masks per year and for more complex 
mask patterns. 

The system described in this issue of The Bell System Technical 
Journal provides for both semiconductor and thin-film integrated cir- 
cuits using facilities that are coupled by an information system. The 
mask-making system is designed to have the capability of meeting the 
demands for larger numbers of increasingly complex masks with a 
known time interval between the receipt of design information and 
the delivery of a complete set of masks. 


II. HISTORICAL BACKGROUND 


All mask-making systems can be described schematically as shown 
in Fig. 1. Two streams of information, one topographic and the second 
descriptive, must be provided. 

The topographic stream starts with the designer who generates the 
input information on the topography for each mask level and stores the 
information using a program such as XYMASK. The information thus
generated is not suitable for direct use in making artwork, so a post- 
processor is used to modify data and make it compatible with a spe- 
cific artwork-generating system. After the processing and, if necessary, 
recycling to eliminate errors, the output data can be used to drive the 
artwork-generating equipment. 

After the artwork is generated, a series of photo-reductions are per- 
formed and, if required, an array of images is produced using a step- 
and-repeat camera to produce the master photo mask. From this mas- 
ter, working copies are generated, the specific process depending on the 
ultimate need. Working copies can be emulsion or chrome on glass for 
semiconductor circuits, or emulsion on glass or transparent plastic for 
thin-film applications. 

Fig. 1—Schematic of the mask-making process.

In parallel with the topographic information, descriptive information
is also required. The descriptive information includes the tone of
the mask; that is, are there clear features on an opaque background
or are there opaque features on a clear background? The tone is
established by the specific process to be used for delineating the pat- 
tern in the final product. For masks requiring the step-and-repeat op- 
eration to generate the array, information concerning the specific pat- 
tern of images must be defined and the necessary data generated for 
producing the array. Finally, the descriptive information must include 
drawing numbers, tolerances, and critical features to be used as inspec- 
tion points; this information relates to the final inspection of the 
master and working copies. The descriptive information is as critical in 
mask making as the topographic information. Because of the combined 
topographic and descriptive information paths and the complex of 
processes, management of a mask-making laboratory is a very important
part of the system.

As device complexity has increased, with a consequent increase in
the amount of data required to describe the topography of an image, 
computer-controlled artwork generators have been developed. Two 
distinct types of artwork-generating equipment have evolved. The 
first are mechanical systems, such as a coordinatograph, the Gerber,* 
or a mechanical reticle generator which operates by moving a gen- 
erating head on a mechanical XY stage or moving the recording 
medium past a fixed optical head. The second type uses an electron 
beam and camera to generate the artwork. 

The mechanical systems which generate the artwork feature-by- 
feature have a potential address structure that is not fully utilizable 
because of errors in the mechanical systems. In general, however, they 
can be operated reproducibly with 6000 addresses in the X and Y 
directions. Because of the nature of the mechanical motion, the time 
required to produce a given piece of artwork is sensitive to both 
the complexity and the size of the feature. 

An example of the use of an electron beam and camera system is 
the SC 4020.† This system is capable of generating a pattern at
electronic speeds by moving an electron beam over a cathode ray tube 
and photographing the image. It produces a mask rapidly but the 
address structure is limited and, as a consequence, it can only be 
used for low-precision artwork generation. 

After the artwork is generated it is, in general, reduced in size. 
Typical reduction cameras for both silicon and thin-film circuitry 
produce images that are reduced by a factor of from 10 to 30 from 
the original artwork. These cameras are all physically large and re- 
quire high-quality lenses to minimize distortion. At this step the 
master mask for thin-film applications is produced. Working copies 
for device processing are generated by contact printing. 

For silicon integrated circuits the image produced by the reduction 
camera is typically ten times the final size. The final reduction and the 
fabrication of the circuit array is done on a step-and-repeat camera. 
Because of the complexity of the array, in terms of the variety of images 
to be produced, the cameras are computer controlled. For a typical mask 
the primary interest is, of course, the formation of an array of precisely 
placed images of the primary pattern that is required for the fabrication 
of the working device. In addition, however, special patterns such as
test patterns for checking processing and alignment features are also
required. Since a typical semiconductor integrated circuit requires from
nine to twelve mask levels to complete the device fabrication, the
step-and-repeat camera must provide not only for the final optical
reduction but also for the precisely controlled and reproducible
positioning of the images so that registration from one mask to another
in the set is achieved. In the past, step-and-repeat cameras could place
an image with a reproducibility of ±1.5 μm. However, the errors in the
mechanical drive and position-sensing systems made absolute positioning
considerably less accurate.

* Gerber Scientific Instruments Company, South Windsor, Connecticut.
† Stromberg Data Graphics, San Diego, California.


III. MASK-MAKING PRECISION, STANDARDS, AND CAPACITY 


With this background of the mask-making process and the then-available
equipment, the growing complexity of masks for both silicon and thin-film
circuits, as measured by the number of coordinates required to describe
the image, has had a major impact on the capability of mask-making
systems to meet the demands. Projection of our future needs for
integrated-circuit masks suggested that we will have to provide for:
(i) a minimum feature size five thousand times smaller in linear
dimension than the over-all size of the circuit pattern; (ii) incremental
sizes of about one-fifth of this minimum feature size; (iii)
reproducibility of about one part in 25,000; and (iv) absolute accuracy
of about one part in 10,000 (both reproducibility and accuracy being
referred to the over-all size of the pattern). Examination of the state
of the art of lens design suggested that cameras could be built to be
consistent with these needs, provided that we adopted a set of standard
mask formats and that we designed lenses and cameras for each standard
field size and reduction ratio.
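
The four requirements above are all stated relative to the over-all
pattern size, so they can be turned into absolute dimensions for any
chosen field. The following is a minimal sketch of that arithmetic; the
function and constant names are illustrative assumptions and do not come
from the original system.

```python
# Illustrative sketch: convert the four projected requirements into
# absolute dimensions for one over-all pattern size. Names are hypothetical.

FEATURES_PER_FIELD = 5_000       # (i) minimum feature = field / 5,000
INCREMENTS_PER_FEATURE = 5       # (ii) address increment = feature / 5
REPRODUCIBILITY_PARTS = 25_000   # (iii) one part in 25,000
ACCURACY_PARTS = 10_000          # (iv) one part in 10,000


def requirements_um(field_size_um: float) -> dict:
    """Return the requirement values, in micrometers, for one field size."""
    min_feature = field_size_um / FEATURES_PER_FIELD
    return {
        "min_feature": min_feature,
        "address_increment": min_feature / INCREMENTS_PER_FEATURE,
        "reproducibility": field_size_um / REPRODUCIBILITY_PARTS,
        "accuracy": field_size_um / ACCURACY_PARTS,
        # Number of address points along one axis of the field.
        "addresses_per_axis": FEATURES_PER_FIELD * INCREMENTS_PER_FEATURE,
    }


if __name__ == "__main__":
    for label, field_um in [("12.5 cm", 125_000.0), ("5.0 mm", 5_000.0)]:
        print(label, requirements_um(field_um))
```

Under these rules every field carries 25,000 address points per axis,
which is the figure used for the pattern generator in Section IV.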

Such a set of standards has been chosen (Table I). They provide
for large thin-film circuits with a nominal field size of 12.5 cm and
a smaller format, 5 cm, which both provides for medium-sized thin-film
circuits and serves as an intermediate step in semiconductor-mask
fabrication. A third standard may become necessary for small,
fine-lined, thin-film masks and appropriate values are listed in Table
I. Semiconductor integrated circuits seem likely to remain under 5
mm square, and a single standard field for a step-and-repeat camera
is sufficient. This set of standards embodies (i) a decision to “go
metric” in device design, (ii) a compromise between design flexibility
and the capital cost of equipment, and (iii) a preference that the
address units, which quantize internal device dimensions, be such
that large integral multiples be immediately identifiable.

TABLE I—STANDARD MASK SIZES

Principal Function        Field Size    Minimum Line Width    Address Size
Thin-film circuits        12.5 cm       25 μm                 5 μm
                          5.0 cm        10 μm                 2 μm
                          2.5 cm        5 μm                  1 μm
Semiconductor circuits    5.0 mm        1 μm                  0.2 μm

In the same period of time in which the growth in the complexity of 
mask patterns has occurred there has been a parallel increase in the 
demand for numbers of masks. This growth has been the direct result 
of a need for larger numbers of masks to fabricate a given device 
coupled with an increase in the number of designs. To illustrate this 
growth of demand, information has been collected from a variety of 
Bell Laboratories groups covering the period from 1966 to the pres- 
ent and estimating the needs for the early 1970s. The results are shown 
in Fig. 2. 

The growth in demand for silicon integrated circuits, SIC, from 
1966 through 1969 has been nearly exponential and has been in part 
inhibited by our inability to produce sufficient quantities of masks. 
Because of the increased numbers of people designing integrated cir- 
cuits, the growth will continue to be slightly greater than linear during 
the early 1970s. Thus, somewhere between 7,500 and 8,000 pieces of 
artwork per year will be required by 1972 or 1973.

Fig. 2—Growth in demand for artwork for silicon and thin-film integrated circuits.

Because the silicon integrated circuit and thin-film circuits are
intimately connected in design, it can be expected that the need for 
thin-film masks, TIC, will also rise during the early 1970s as shown 
in Fig. 2. In part, this growth represents the need for increasing num- 
bers of masks for crossovers and tantalum circuits that are combina- 
tions of resistors, capacitors, and crossovers. 

If we take the composite of these two trends, we find that develop- 
ment activities will require that approximately 14,000 pieces of art- 
work be generated per year by 1972. To meet this demand, it was 
decided to build two mask-making laboratories, one at the Murray 
Hill, New Jersey, location and one at the Allentown, Pennsylvania, lo- 
cation. Each laboratory was to have a master mask capacity of 
10,000 per year. 


IV. CHOICE OF PATTERN GENERATOR 


Pattern generation is a key element in the total process of mask- 
making in the sense that the difficulty of meeting the many demands 
placed on this step is so great that the adjacent steps of the process 
must largely be tailored to the choice of pattern generator. The over- 
all process resulting from each plausible choice of pattern generator 
design must then be evaluated before a final system choice is made. 

The nature of the problem logically requires relative motion in 
two dimensions between a writing element and a recording medium. 
The functional requirements which have been discussed in the previous 
section suggest a digitally controlled plotter having resolution cor- 
responding to 25,000 by 25,000 address points in the pattern field 
and a plotting time for the more complex patterns of about 10 minutes. 
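
A rough check of the writing rate implied by these figures is sketched
below. It assumes, for illustration only, that the full plotting time is
spent writing and that every address point takes equal time; the variable
names are ours.

```python
# Back-of-envelope estimate of the average writing rate implied by the
# plotter requirement. Values come from the text; names are illustrative.

ADDRESSES_PER_AXIS = 25_000
PLOT_TIME_S = 10 * 60            # "about 10 minutes"

total_points = ADDRESSES_PER_AXIS ** 2
points_per_second = total_points / PLOT_TIME_S
seconds_per_point = 1.0 / points_per_second

print(f"address points per pattern: {total_points:,}")
print(f"average rate: {points_per_second / 1e6:.2f} million points/s")
print(f"time per point: {seconds_per_point * 1e6:.2f} microseconds")
```

The result, slightly under one microsecond per address point, is the
reason a feature-by-feature mechanical plotter cannot meet the
requirement.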

Reviewing the pattern generators which have previously been used, 
we first have machines such as automatic coordinatographs and auto- 
matic drafting machines with optical exposure heads. A machine of 
this type could be designed to give the desired resolution. The plotting 
time for complex patterns on such machines has already exceeded ten 
hours. Another approach is the reticle generator which makes a set 
of elementary figures available from which every mask will be 
assembled. We have not found any set of figures which offer sufficient 
speed and flexibility. 

The following three approaches to pattern generation appear to 
have sufficient resolution, accuracy and speed to meet our require- 
ments: drum recording, electron-beam recording and light deflection. 
Each is discussed in turn.


4.1 Drum Recording


In the drum recorder the recording medium is wrapped around a 
cylinder as shown in Fig. 3. The two dimensions of motion are now 
achieved by synchronizing the rotation of the drum and translation 
of either the drum or writing head parallel to the axis of rotation. 
If we insist on a system capable of writing on various areas of the 
recording medium in an arbitrary sequence (random access), this 
system offers no advantage over a flat-bed plotter; however, it does 
make it possible to create any pattern by continuous rotation of the 
drum and a synchronized translation. After unwrapping the record- 
ing medium, the image would appear as though it had been created 
by a TV-like raster. It is this concept of a uniformly swept raster 
which makes a mechanically scanned system feasible. 

This pattern generator could be engineered within a relatively wide 
range of sizes, tolerances on the precision of the translational mech- 
anism, on the concentricity of the drum, and on the thickness of the 
recording medium becoming increasingly tight in smaller machine 
sizes. A 12.5-cm pattern size would be possible, while a 25-cm size 
unit would be relatively simple to develop. The primary problem in
this approach is that the recording medium must be flexible. The
combination of a silver halide emulsion on a film base does not have
sufficient dimensional stability for our purposes. An alternative which
was considered was laser machining some appropriate coating from
a metal-based multi-layer medium. Brief experiments suggested that
such a medium would not be easy to handle and, being opaque,
would have to be used in front-lighted reduction cameras. Such
cameras are inefficient and the drum approach was dropped from further
consideration.

Fig. 3—Schematic of a drum plotter.


4.2 Electron Beam Recording 


An electron beam machine in which a finely focussed beam writes 
directly on a recording medium of appropriate resolution and sensi- 
tivity is a probable approach to pattern generation. An electron beam 
recorder can be designed for a beam size of a few microns and a 
field of several centimeters. Choice of a 5-cm field allows direct
generation of one standard format and allows the other standard 
sizes to be produced in cameras using glass condenser illumination. 
Pattern description for this system is a simple extension of previous 
work for cathode ray tube systems. This technique seems to offer 
system compatibility; the major uncertainties which existed at the 
time at which a selection had to be made (November 1967) were 
whether the desired accuracy could be obtained, and whether the 
sensitivity of electron beam systems to unwanted electric and mag- 
netic fields would limit its reproducibility. These uncertainties were 
sufficiently great that this approach was not chosen for our initial 
system, but development work was continued to provide a compatible 
system which might be advantageous for future large-area devices 
such as color and document-mode Picturephone® camera tubes and 
magnetic domain devices. This machine is described in a companion 
paper.


4.3 Light Deflection 


Of the three approaches, only deflection of a light beam seemed 
capable of meeting our anticipated requirements. Since the combina- 
tion of plotting time and number of resolvable elementary areas in 
the pattern field requires exposure times of less than one microsec- 
ond per resolvable area, the use of a laser beam to achieve a small, 
very bright writing spot was indicated. Deflection of a laser beam 
can be accomplished by electro-optic or acousto-optic elements, but 
available deflector materials were not of sufficient quality to give
plotting times less than one or two hours. Reflection from a spinning
mirror, however, can give speeds up to and beyond those required as 
long as we accept a uniformly rotating mirror as the basis for our 
system. This led to a rotating-mirror pattern generator design where 
a modulated light beam would be swept across a photographic plate 
in one direction at a rate of about 50 scans per second, while the 
plate holder would move in the direction perpendicular to the scan 
lines. In less than ten minutes 25,000 overlapping scan lines could 
build up the complete pattern image. Again, in this system, we have 
employed continuous rotation of the higher-speed scanning member 
to achieve the desired plotting rate in a mechanical system. Imple- 
menting this approach requires that a lens be mounted adjacent to 
the rotating mirror, a diverging input beam being collimated by the 
lens and refocussed onto the recording medium after reflection. Be- 
cause of the inverse relationship between the aperture of a lens and 
the diameter of the smallest spot which the lens can image and be- 
cause the field angle for which a lens can be designed is sensitive to 
the relative aperture size, the lens and mirror sizes enlarge rapidly 
as the desired pattern size is diminished. Specifically, the design
appears impracticable at the largest standard pattern size of Table I
and relatively easy at a 25-cm pattern size. Thus, the initial pattern
size for this machine design is rather firmly bounded by optical-design
considerations on the one hand and by considerations of plate size,
governing the size of both processing equipment and reduction cameras,
on the other. Photographic plates measuring 8 by 10 inches are
commercially available and, in ¼-inch thickness, can be obtained with
sufficient flatness. Translating to metric units gives a maximum usable
area of about 18.3 cm by 23.4 cm. With 25,000 address units across the
shorter dimension, this puts an upper bound of 7.3 μm on the
address unit size, and 7.0 μm seems a reasonable value. A review of
the optical design based on this value led to reasonable sizes for the
individual components and for the over-all machine.
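
The two numerical bounds in this section can be checked with a short
calculation. The sketch below takes the shorter usable plate dimension as
18.3 cm, the value consistent with the stated 7.3 μm bound, and assumes
one scan line per address unit; the names are illustrative.

```python
# Sketch of the plate-size and scan-rate arithmetic for the PPG.
# The 18.3 cm short side is an assumption chosen to match the 7.3 um bound.

USABLE_PLATE_CM = (18.3, 23.4)   # usable area of an 8 by 10 inch plate
ADDRESSES_PER_AXIS = 25_000      # address points (and scan lines) per axis
SCANS_PER_SECOND = 50            # "about 50 scans per second"


def max_address_unit_um(short_side_cm: float, addresses: int) -> float:
    """Largest address unit that still fits `addresses` units on the plate."""
    return short_side_cm * 1e4 / addresses   # 1 cm = 10,000 um


bound_um = max_address_unit_um(min(USABLE_PLATE_CM), ADDRESSES_PER_AXIS)
plot_minutes = ADDRESSES_PER_AXIS / SCANS_PER_SECOND / 60

print(f"upper bound on address unit: {bound_um:.2f} um")
print(f"time for {ADDRESSES_PER_AXIS:,} scan lines: {plot_minutes:.1f} min")
```

Both results agree with the text: the address unit must not exceed about
7.3 μm, and 25,000 scan lines at 50 scans per second take less than ten
minutes.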

Pattern description for the primary pattern generator (PPG) requires
that the topographical data be sorted into a sequence controlled
by the directions of scan, and presented to the generator at a
predetermined rate. These are novel requirements relative to our
experience in computer aids to mask-making. While the sorting operation
requires large files in the off-line data-processing system, the operation
is not a costly one. A larger problem is created by the need to present
data to the generator from its on-line control computer at a
predetermined rate of about 2 million bits per second. The strategy used
to meet this demand is such that most of the core memory is required for
storage of coded data describing the current scan line and the changes
required to go from the current line to those immediately following.
All characteristics of features, particularly where they include
slant and curved edges, therefore have to be computed off-line and coded
for transfer by means of a magnetic tape. At this time this is a
significant disadvantage in the choice of the PPG as opposed to a
random-access generator such as the electron beam machine.
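
The idea of sorting features into scan order and then describing each
line by its changes from the previous one can be illustrated with a small
sketch. This is not the coding scheme actually used for the PPG, which is
not specified here; it assumes axis-aligned, non-overlapping rectangles
given in address units, and all names are hypothetical.

```python
# Hypothetical sketch of scan-ordered pattern description: rectangles are
# sorted into per-line change lists, and each scan line is rebuilt from the
# previous line plus its changes. Not the actual PPG coding scheme.
from collections import defaultdict


def build_line_changes(rects):
    """Sort rectangles into per-scan-line change lists.

    Each rectangle is (x0, y0, x1, y1) in address units, half-open on the
    upper edges. Returns {line: (starts, stops)}, where starts and stops
    are the x-intervals that begin or cease to be exposed at that line.
    """
    changes = defaultdict(lambda: ([], []))
    for x0, y0, x1, y1 in rects:
        changes[y0][0].append((x0, x1))
        changes[y1][1].append((x0, x1))
    return dict(changes)


def scan_lines(rects, n_lines):
    """Yield (line, intervals): the exposed x-intervals on every line."""
    changes = build_line_changes(rects)
    active = []                      # intervals exposed on the current line
    for line in range(n_lines):
        starts, stops = changes.get(line, ([], []))
        for interval in stops:
            active.remove(interval)
        active.extend(starts)
        yield line, sorted(active)


if __name__ == "__main__":
    pattern = [(2, 1, 5, 3), (7, 2, 9, 4)]
    for line, intervals in scan_lines(pattern, 5):
        print(line, intervals)
```

Only the change lists need to be held for upcoming lines, which mirrors
the memory strategy described above. Slant and curved edges would change
on every line and are the costly case noted in the text.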

The characteristics of the PPG previously discussed determine the
design requirements which it must meet. With reference to Table
I, it is evident that for thin-film circuits optical reduction of the
image plate from the PPG is required. A reduction camera that
reduces the image 1.4 times is required for the bulk of the thin-film
circuits that have a minimum line width of 25 μm. A second camera
with a 3.5 reduction ratio is also required for 10-μm minimum
lines on a smaller field. This camera is also used for silicon integrated
circuits. A third reduction camera may be required in the future
if 5-μm lines are needed on small areas. Conventional
glass condenser systems are not practical for these cameras, and
large-area, diffuse sources with Fresnel lens condenser systems are used to
meet our requirements. The cameras have been designed with no
operator adjustments for either reduction ratio or focus.

For silicon integrated circuits the image produced by the 3.5X re- 
duction camera is used as the reticle in the step-and-repeat camera 
which provides an additional 10-times reduction. The step-and-repeat
camera, in addition, generates an array of images—each with a 5-mm 
maximum field size and a maximum array size of 10 cm by 10 cm. 
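
The reduction ratios quoted above follow from the 7.0 μm address unit of
the PPG and the address sizes of Table I. A minimal sketch of the chain
is given below; the dictionary labels and function name are our own.

```python
# Sketch of the reduction chain from the PPG plate to each standard format.
# Ratios and the 7.0 um address unit are from the text; names are ours.

PPG_ADDRESS_UM = 7.0  # address unit on the pattern generator plate

# Reduction stages applied for each format.
CHAINS = {
    "thin-film, 25 um lines": (1.4,),
    "thin-film, 10 um lines": (3.5,),
    "semiconductor, 1 um lines": (3.5, 10.0),  # camera, then step-and-repeat
}


def final_address_um(ratios, start_um=PPG_ADDRESS_UM):
    """Divide the starting address unit by each reduction ratio in turn."""
    size = start_um
    for ratio in ratios:
        size /= ratio
    return size


if __name__ == "__main__":
    for name, ratios in CHAINS.items():
        print(f"{name}: {final_address_um(ratios):.2f} um address unit")
```

The three chains reproduce the 5 μm, 2 μm, and 0.2 μm address sizes of
Table I.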


V. SYSTEM DESIGN 


In completing our account of the new mask-making system, we 
should recognize that not all devices are square. Many thin-film in- 
tegrated circuits are rectangular. As long as a camera is to be used 
to image a rectangular pattern, the diagonal measure of the pattern 
is a dominant consideration. It is not necessary, however, to com- 
pound this penalty by fitting a square pattern field within the cir- 
cular field of the cameras and then constraining a rectangular pattern 
to lie within the square. Thus the field of the pattern generator was 
enlarged from 25,000 address units square to 32,000 units (22.4 cm)
in length and, at the same time, we enlarged the width to 26,000 units
(18.2 cm) since the space was available.
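
The enlarged field can be checked against the usable plate area with the
sketch below. It assumes a 7.0 μm address unit and a usable plate area of
18.3 cm by 23.4 cm, as derived in Section 4.3; the helper names are
illustrative.

```python
# Sketch: convert the enlarged PPG field to centimeters and check that it
# fits the usable plate area. Plate dimensions are assumed values.

ADDRESS_UM = 7.0                  # chosen address unit
USABLE_PLATE_CM = (18.3, 23.4)    # assumed usable area of the plate


def units_to_cm(units, address_um=ADDRESS_UM):
    """Convert a length in address units to centimeters."""
    return units * address_um / 1e4   # 1 cm = 10,000 um


def fits_on_plate(width_units, length_units, plate_cm=USABLE_PLATE_CM):
    """True if a width-by-length field fits within the usable plate area."""
    short_cm, long_cm = sorted(plate_cm)
    return (units_to_cm(width_units) <= short_cm
            and units_to_cm(length_units) <= long_cm)


if __name__ == "__main__":
    print(f"length: {units_to_cm(32_000):.1f} cm")
    print(f"width:  {units_to_cm(26_000):.1f} cm")
    print("fits on plate:", fits_on_plate(26_000, 32_000))
```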

Fiducial marks which provide for registration of patterns in the 
step-and-repeat camera are plotted in the corners of the 32,000- by 
26,000-unit rectangle. In addition, the pattern generator writes two
strips of system data, one above and one below the rectangle. The
first strip shows the identification number of the particular pattern 
generator used and the sequence number in octal form. The second 
strip contains the drawing number of the pattern in three forms. One 
is the normal form for the direct use of mask shop operators, but 
in addition the number is repeated in two binary-coded formats suit- 
able for machine reading. One is designed to be read when the pattern 
generator plate is in the reduction camera and the other to be imaged 
by the 5-cm field-reduction camera and read when the resulting 
reticle is in the step-and-repeat camera. 
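
The exact layouts of the octal and binary-coded strips are not given
here. Purely as an illustration of the two kinds of representation, the
sketch below writes a sequence number in octal and a drawing number as
4-bit binary-coded-decimal groups; the function names and the choice of
binary-coded decimal are assumptions, not the format used on the plates.

```python
# Hypothetical illustration of octal and binary-coded number strips.
# The real plate formats are not specified in the text.


def to_octal(sequence_number: int) -> str:
    """Return a non-negative integer as an octal digit string."""
    if sequence_number < 0:
        raise ValueError("sequence number must be non-negative")
    return format(sequence_number, "o")


def to_bcd_bits(drawing_number: str) -> str:
    """Encode a decimal digit string as 4-bit binary-coded-decimal groups."""
    if not (drawing_number.isascii() and drawing_number.isdigit()):
        raise ValueError("this sketch handles decimal digits only")
    return " ".join(format(int(digit), "04b") for digit in drawing_number)


if __name__ == "__main__":
    print(to_octal(83))
    print(to_bcd_bits("905"))
```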

These provisions for machine reading of the drawing number are 
part of a supervisory and scheduling system known as the Mask 
Shop Information System (MSIS). Earlier experience with mask-making
laboratories of more modest capacity than our 10,000 per
year objective taught us that the scheduling system can be the factor 
determining the time to complete a job. The equipment design which 
has been outlined here and which will be detailed in the following 
papers can therefore shorten the time to complete a job only if we 
add a system for storage and rapid retrieval of all the data required 
to make and inspect the masks and keep the necessary records. Sched- 
uling each phase of each job is included; as each step after pattern 
generation is due, the MSIS displays to the camera operator the 
drawing number of the pattern generator plate or reticle and the lo- 
cation of that plate in the physical storage trays provided. The 
system then reads the plate number and advises the operator if an 
error has been made. At the step-and-repeat stage, all data describing
the step-and-repeat array is fed to the on-line control computer.


VI. SYSTEM APPRAISAL 


While we have not yet had sufficient experience with MSIS, nor with 
a level of demand for masks which would have fully exercised MSIS, 
we can make a preliminary appraisal of the remainder of the system. 

The PPG has accomplished essentially everything we set out to do. 
For the first time in many years, artwork generation is no longer the 
pacing item in mask making; we have a machine which takes simple 
patterns or patterns of a complexity we would not previously have 
attempted, makes patterns in which 10 percent of the area is exposed 
or patterns in which 90 percent is exposed, semiconductor device pat- 
terns, thin-film patterns, test patterns—and even digitized photo- 
graphs—and turns them out with inhuman regularity. While the 
optical-design pattern bound us into a very narrow size range, the
resulting machine is the right size for the operator’s convenience. This
is not to say that there is no room for further improvement in the area 
of artwork generation. We see future device applications in which the 
higher resolution offered by an electron beam machine could be of 
major importance, sufficient to justify incorporating such a unit— 
compatible with the PPG system standards in format and plate size— 
into the mask-making laboratories. 

Turning to the reduction cameras, we feel that the basic system 
decisions which were made—separate fixed cameras using Fresnel con- 
denser illumination with monochromatic light—were sound. We do 
believe that further improvements in system performance might be 
obtained through achieving closer tolerances in lens fabrication; 
essentially the state of the art of lens design has run ahead of lens 
assembly techniques. This comment applies even more strongly to 
lenses, such as the one for the step-and-repeat camera, which are 
aimed at feature sizes of a few wavelengths of light. The step-and- 
repeat camera lens proved extremely difficult to build, and appears to 
have distortion of about one part in 5,000 arising from fabrication 
tolerances; we would argue that paper designs of lenses of higher per- 
formance—perhaps seeking comparable resolutions over a larger field 
—should be held suspect until actual models are built and tested. 

The new step-and-repeat camera is a development of a different 
kind from most of the other parts of this program. No single charac- 
teristic of this unit shows an order of magnitude improvement over 
earlier equipment, nor does it contain conceptually new major ele- 
ments. The improvements which have been made, factors of two or 
three in smallest feature width, in linear field dimensions, in linear 
array dimensions, and in speed, are cumulative in their impact and are 
essential to the satisfaction of our anticipated needs.

Computer systems play a fundamental role in the operation of precision
integrated-circuit pattern generators. This paper first describes the XYMASK
system which provides a language for describing the geometric shapes in 
a set of masks and generates graphical artwork on a number of different 
pattern generators. The remainder of the paper is devoted to discussions 
of system-design considerations and algorithms for generating input to 
the primary pattern generator and the electron beam machine. 


I. INTRODUCTION 


Computers are indispensable today in the operation of any sizable 
mask-making laboratory. Nearly all precision pattern generators are 
either directly computer controlled or else require input of a form 
which can be reasonably obtained only through the use of computers. 
Furthermore, the complexity and sheer volume of masks currently 
required effectively prohibit nonautomated procedures. 

The mask-making laboratory system described in this issue relies 
heavily on the use of computers. The first part of this paper describes 
a system of programs which links a circuit designer to the mask- 
fabrication processes; the next two sections discuss algorithms and 
programs for generating input to the primary pattern generator (PPG) 
and the electron beam machine (EBM). 


1.1 Computer-Aided Generation of IC Masks 

Masks are tools required in the fabrication of integrated circuits 
and other devices. The starting point in mask design is thus an 
electrical schematic or logic diagram of the desired device. An engi- 
neer or technician first allocates scaled geometric shapes to each of 
the circuit components; he then arranges and rearranges these shapes 
on a similarly scaled substrate area. During this placement phase,
many criteria are generally involved in evaluating the suitability or 
desirability of one arrangement over another. Some examples are 
thermal interaction, packing density, and the ability to realize the 
required component interconnections. The latter criterion is really 
applied in the next phase wherein the interconnection pattern is de- 
signed in detail. For most cases, several iterations between the place- 
ment- and interconnection-design phases are required before a satis- 
factory layout is obtained. At this point the geometric details of all 
the required masks are completely known; the next step in the process 
is mask generation. 

The draftsman or engineer is now faced with the problem of trans- 
forming the mask layouts into a form suitable for driving a pattern 
generator. The severity of this problem depends on two factors: the 
form of input required by the particular pattern generator, and the 
complexity of the masks. For pattern generators which are concerned 
solely with the outline of the geometric features, such as an automatic 
knife coordinatograph cutting rubylith, the solution is tedious but 
straightforward. Either manually or via a digitizer, the coordinates
of the endpoints for each horizontal feature boundary line, followed 
by the coordinates of each vertical feature boundary line, can be 
recorded on punched paper tape for each mask level. This tape would 
then be processed by the coordinatograph, the rubylith master peeled, 
and the masks obtained after appropriate photographic processing of 
the rubylith master. However, for more sophisticated pattern gener- 
ators which operate by filling in the interior of mask features with 
beams of light or electrons on photographic film, substantial use of 
computers is necessary to convert the mask geometry into commands 
acceptable by the pattern generators. 


1.2 The XYMASK System


The system of programs in use at Bell Telephone Laboratories and
Western Electric Company for computer-aided production of
integrated-circuit masks is known as XYMASK. First operational in late
1967 and subsequently improved and modified, the current version of XYMASK
evolved from two earlier generations of mask-making programs. Three 
of the more important system-design goals may be stated as follows. 

(i) It should provide a standard user-input language for conveniently
and efficiently describing mask-feature geometry.
(ii) Insofar as possible, the system should be independent of any
particular graphical output device.
(iii) The implementation should be highly independent of the host
computer to enhance portability of both the system and the mask
specifications.

The first of these goals is extremely important. Its realization 
greatly facilitates the transmittal of device designs not only among 
Bell Laboratories locations but also between Bell Laboratories and 
Western Electric Company for production. Moreover, the user-input 
language is a vital factor in the interface between the mask designer 
and the system since its convenience and flexibility have a direct bear- 
ing on user acceptance and satisfaction. 

The second goal is a necessity due to the diversity and number of 
graphical output devices available at Bell Laboratories locations. In 
an indirect manner, attainment of this goal also simplifies the addi- 
tion of new output-device capability as we shall see below. 

The third goal arises from the use of different large-scale computers 
at Bell Laboratories and Western Electric locations and the ever- 
present possibility of new ones being acquired. The most important 
user benefit is the complete independence from any particular com- 
puter of the mask descriptions encoded in machine-readable form in 
the input language; the same mask-description input deck will pro- 
duce identical artwork on different computers. Again indirectly, attain- 
ment of this goal has simplified program implementation and main- 
tenance. The implementation is almost exclusively in a subset of 
FORTRAN IV common to the IBM 360 and GE-635 computers; there
is essentially one set of source-language programs which runs on the 
several different computers. 


1.3 The User-Input Language 


As a preliminary to discussing the system organization of XYMASK, 
it will be helpful to describe briefly the user-input language. A somewhat
more detailed description is given by B. R. Fowler.¹ Basically,
the input language provides a vehicle for describing the various geo- 
metrical shapes contained in a mask or set of masks in a computer- 
readable form. As such, the most primitive statements in the language 
are used to specify the interiors of three basic geometrical shapes: 
rectangles, polygons, and paths. In this context, rectangles are defined 
to have their edges parallel to the coordinate axes and are specified 
by giving the coordinates of the vertices on either diagonal. The state- 
ment 


label RECT mask, 10, 20, 30, 40

illustrates the format of the primitive statements and defines a
rectangle with the lower-left vertex at X = 10, Y = 20, and upper- 
right vertex at X = 30, Y = 40. The label and mask attributes are 
discussed below. The polygon primitive is used to define generalized 
polygons having either straight lines or circular arcs as edges. The 
shape and size are fixed by giving the coordinates of the vertices in 
the order in which they are encountered in either a clockwise or 
counterclockwise tour of the periphery. The path primitive is used to 
specify a path of given finite width. The size and shape are fixed by 
giving the width and the coordinates of the endpoints and breakpoints 
of the centerline as they are encountered in a tour along the path. The 
centerline may contain circular arcs as well as straight-line segments. 

The preceding paragraph discussed only the specification of the 
shapes and sizes of the basic geometrical features. The positions of 
these features on the masks may be specified in either of two ways. 
If a label attribute is not specified for the feature, the coordinate 
values define its position as well as its shape and size. On the other 
hand, if a label attribute is given, separate input-language statements 
must be used to specify the position. In addition to position, these 
statements also permit the orientation of the feature to be altered by 
reflection about either coordinate axis together with a rotation through 
an arbitrary angle. 

In general, a set of individual but inter-related masks is required 
in the fabrication sequence for an integrated-circuit device. A tran- 
sistor, for example, may require geometrical features on a number of 
different masks for forming collector, base, and emitter regions. The 
XYMASK user-input language allows specification of all geometrical 
features occurring in all required mask levels for a device in whatever 
intermixed order is most convenient for the user. In order to correlate 
the various features with the appropriate mask levels, a mask-level 
identification is required as part of the specification of the rectangle, 
polygon and path primitives. 

It is often desirable and useful to treat a group of geometrical 
shapes as a structural entity; for example, it is far more convenient 
to position a transistor at the required locations as a structural entity 
rather than as a set of individual primitive shapes. The user-input 
language allows this hierarchical nesting of structures to an arbitrary 
depth. In other words, it is possible to define a structure which con- 
tains structures of lower “order” as well as basic geometric shapes. 
The structure may be positioned on the masks, possibly with reorientation,
as described above for simple geometric shapes. This hierarchical
structuring in conjunction with reorientation allows the user
to take advantage of repetitions and symmetries in the design in order 
to reduce the number of statements and effort required to encode the 
design in the user language. 

Statements are also available in the input language to retrieve
previously designed structures from XYMASK libraries and to invoke
component structure-design routines. Transistor designs are typical
library entries. An integrated-circuit designer generally uses transistor
designs which have been thoroughly tested and characterized. These
designs are stored as library entries which contain the XYMASK language
specification in the form of hierarchical groupings of the
appropriate primitives. Library retrieval provides a sort of shorthand
for the user in that only the particular library and the entry
identification need be specified in the input deck in contrast to the
equivalent set of XYMASK input statements.

Computer programs have been developed to design certain com- 
ponents and structures used in integrated circuits. Pattern generation 
for thin-film meander resistors, and the generation of sheafs* of inter- 
connection paths are examples of such programs in current use. 

Versions of these programs, called design routines, have been integrated
into the XYMASK system. A single statement in the input
language allows the user to specify the desired routine together with
whatever parameters are required. Output from the routine consists
of XYMASK statements specifying the generated design. These statements
are automatically incorporated into the user’s input.

The final feature of the user language to be discussed deals with 
the specification of particular graphical devices and output options. 
Graphical output may be requested either in the form of outline 
drawings or finished artwork. The outline drawings are generally 
produced on line plotters and are used to verify that the mask descrip- 
tions as encoded in the input language are correct. As implied, only 
the outlines of the geometrical features are displayed. The finished 
artwork is the desired end product of the system; for plotters work- 
ing on photographic film, the interiors of the geometrical features have 
one tonality (clear or opaque) while the area which is exterior to all 
figures has the opposite tonality. 

A single statement is used to indicate the plotter and any pertinent
parameters such as drawing type and scale factor. The user has the
capability of requesting individual drawings or artwork for any or
all masks. He may also request composite drawings of any two or
more masks. This latter feature is widely used for error checking.

* A sheaf is a family of paths each member of which can be derived from a
generic member by translating each of its path segments normally through a given
distance and lengthening or shortening it as required to create a nested copy of
the generic path.


1.4 XYMASK System Organization 


A simplified diagram of the XYMASK system is shown in Fig. 1.
The major program segments are the input preprocessor, the input
processor, the execute processor, and the family of device-dependent
output postprocessors. Input to the system is a machine-readable
description of the desired masks encoded in the XYMASK user language.
This input is free format and may be generated by hand
encoding and keypunching, digitizing large-scale layouts, or by other
computer programs such as interconnection-routing routines.

The input first passes through the input preprocessor. All input
statements other than design-routine invocations or library retrievals
are transmitted to the expanded input file without change. When a
design-routine invocation is found, control is passed to that design
routine, and the generated XYMASK statements together with the
invocation are transmitted to the expanded input file. Library retrievals
are treated similarly in that retrieval is made when the statement is
encountered in the input deck; the retrieved XYMASK statements
along with the retrieval statement are transmitted to the expanded
input file. At the conclusion of the preprocessor phase, then, the
expanded input file contains the original input statements interpolated
with the results of any design-routine invocations or library retrievals.

Fig. 1—The XYMASK system.

The system design of the remainder of the XYMASK system was
heavily influenced by the desired relative independence from any par- 
ticular graphical output device. Accordingly, output-device dependence 
is relegated to a family of postprocessors each of which receives in- 
put from a common file referred to as the ‘output file’. 

This output file contains a representation of each of the masks 
requested in the XYMASK input deck in a form such that all device-independent
processing has already occurred. Each mask is represented
by a separate subfile, and each subfile contains only the defining 
coordinates of individual paths and polygons in their final positions 
and orientations. 

The input and execute processors must then transform the expanded
XYMASK input statements into the form required for the
output file. The most significant aspects of this transformation are as 
follows: removal of all hierarchy by generating new copies of the 
various primitives as required while simultaneously carrying out 
specified reorientation and positioning; and sorting the resulting 
primitives into separate sets according to their individual mask-level 
identifications. 
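
The two aspects of the transformation listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the XYMASK implementation (which is written in FORTRAN IV): the dictionary layout, the tuple tags "poly" and "place", and the names placement and flatten are all hypothetical, and only polygon primitives are shown.

```python
import math
from collections import defaultdict

def placement(dx=0.0, dy=0.0, angle_deg=0.0, reflect_y=False):
    """Hypothetical placement: optional reflection (negate Y), then a
    rotation about the origin, then a translation to (dx, dy)."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))

    def apply(point):
        x, y = point
        if reflect_y:
            y = -y
        return (c * x - s * y + dx, s * x + c * y + dy)

    return apply

def flatten(structures, name, outer=lambda p: p, out=None):
    """Remove all hierarchy below structure `name`.

    Every primitive is copied once per placement, moved to its final
    position and orientation, and filed under its mask level.
    """
    if out is None:
        out = defaultdict(list)
    for item in structures[name]:
        if item[0] == "poly":                 # ("poly", mask, vertices)
            _, mask, vertices = item
            out[mask].append([outer(v) for v in vertices])
        else:                                 # ("place", child, placement)
            _, child, place = item
            # Compose the child's placement with the enclosing one.
            flatten(structures, child,
                    lambda p, f=place, g=outer: g(f(p)), out)
    return out

# Illustrative data: one transistor structure placed twice on a chip.
structures = {
    "TRANSISTOR": [
        ("poly", "BASE", [(0, 0), (4, 0), (4, 2), (0, 2)]),
        ("poly", "EMITTER", [(1, 0.5), (2, 0.5), (2, 1.5), (1, 1.5)]),
    ],
    "CHIP": [
        ("place", "TRANSISTOR", placement(dx=10, dy=20)),
        ("place", "TRANSISTOR", placement(dx=50, dy=20, angle_deg=90)),
    ],
}
flat = flatten(structures, "CHIP")
```

Here flat holds one list of positioned polygons per mask level, which corresponds to the separate subfiles of the output file. Nesting to arbitrary depth is handled by the recursion.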

The above aspects of the transformation suggest that detailed 
descriptions of all required masks be available in memory in a 
convenient form prior to starting the transformation. Thus the input 
processor reads the input-language descriptions of the masks, makes 
extensive error checks, and stores the descriptions in a hierarchical 
data structure. Upon completion of this process, the execute processor 
comes into play to generate the output file from the data structure. 

When output-file generation is complete, the appropriate post- 
processor for the first mask is activated according to the output de- 
vice specified by the user. Upon completion, processing is initiated 
on the second, perhaps using a different postprocessor if the user so 
desired. In like fashion, the remainder of the output file is processed 
and the job terminates. 

Each postprocessor is responsible for the ultimate generation of 
artwork on a particular graphic-output device. In general, the post- 
processor output is a magnetic tape which drives the actual device, 
although on-line devices, such as the sTarn? line-drawing plotter, are
easily accommodated. We can again view a postprocessor as a data 
transformer; it is responsible for reading each path and polygon 
specification from the output file and generating the proper output- 
device commands or codes for plotting that figure. The system design 
is such that all postprocessors are essentially independent programs 
which receive all of their input from the XYMASK output file. The
system is thus open ended in the sense that new postprocessors can 
be easily and conveniently added. 

With regard to execution times for a typical set of masks, the 
input and execute processors each require on the order of one-minute 
running time on an IBM 360/65. Postprocessor execution times are 
generally longer and tend to dominate other costs for the run. 

The following two sections of the paper are devoted to detailed 
discussions of specific postprocessors for the PPG and EBM plotters 
described elsewhere in this issue. The two differ fundamentally in the 
manner in which pictures are produced. The EBM is a random-access 
plotter; the order in which mask features are plotted is immaterial. 
The PPG, on the other hand, produces pictures using a raster-scan 
technique. The contributions of all features intersected by each scan 
line must be determined and transmitted to the device in the order 
needed to generate the picture. 

The PPG postprocessor was developed at Bell Laboratories, Murray 
Hill, New Jersey, by A. G. Gross. The EBM postprocessor was developed
at the Western Electric Engineering Research Center, Princeton,
New Jersey, by Mrs. S. B. Watkins and J. Raamot.


II. THE PPG POSTPROCESSOR 


The operation and functioning of the PPG together with its control
computer are discussed in this issue by A. Zacharias, et al.³ For
convenience, we will briefly review here those aspects which are of 
importance to the postprocessor. 

For our purposes, we can consider the photographic plate plotting 
surface to be a rectangular lattice of 26,000 x 32,000 addressable 
points. A writing beam scans the lattice on a line-by-line basis, with 
the beam turned on at those address points interior to mask features, 
and off otherwise. The writing beam is controlled by a 26,000-bit 
display buffer in the control computer with each bit position repre- 
senting one address along the scan line; the beam is turned on or off 
at an address according to whether the content of the corresponding
bit position is one or zero. After completing a scan line, the bit con- 
figuration in the display buffer must in general be modified to cor- 
rectly represent the geometric detail in the next scan line. When 
updating is completed, the bit configuration is again used to modulate 
the writing beam; this cycle continues until all 32,000 scan lines 
have been completed. 


2.1 Interface between Postprocessor and Control Computer 

Let us for a moment consider the subsystem comprised of the PPG 
postprocessor and the control computer program. The postprocessor 
runs on a large central computer, receiving input from the XYMASK
output file discussed previously, and writing output on magnetic tape. 
The information is read from the magnetic tape by the control com- 
puter program and used to load and update the display buffer. The 
magnetic tape constitutes an interface between two computer pro- 
grams: the nature of the information on the tape can thus be varied 
to share, in some sense, the computational load between the two 
computers. 

At one extreme, essentially all computation can be made in the
postprocessor. The magnetic tape then contains 32,000 records, each
representing a complete 26,000-bit display buffer configuration. In this
format each mask requires transmission of something like a billion
bits (32,000 × 26,000, or about 830 million) between the computers.
At the other extreme, the control computer can process the XYMASK
output file and develop the display-buffer contents itself. This
relegates far too much computation to the control computer, since
display buffer regeneration cannot in general keep up with the
pattern generator plotting rate. The result is a severe degradation
in plotting time.

A compromise between the above extremes can be reached by con- 
sidering the basic information required to properly load the display 
buffer. Let us see what is involved for an extremely simple mask 
containing a single vertical bar. For all scan lines which do not intersect
the bar, the display buffer must contain all zero bits, while the
bit configuration for the remaining scan lines is invariant and need 
only be set once. The basic data needed to load the display buffer 
involves only details of the changes, if any, in the bit configuration 
between successive scan lines. This is true even for complex masks 
since a high degree of similarity generally exists between one scan 
line and the next. One is thus naturally led to consider a magnetic 
tape encoding scheme which takes advantage of these similarities by
detailing only the required configuration changes from one scan line 
to the next. 

A complete description of the various commands used in the en- 
coding scheme appears in Ref. 3. The commands fall naturally into 
three groups. The first group contains commands of an incremental 
nature for updating the bit configuration in the display buffer. 
Various combinations of these commands may be used to indicate 
that strings of one or more bits in the buffer are to be set to 
zeros or ones as required for the next scan. All bits not referenced 
in this fashion represent recurring mask detail and are unchanged 
for the next scan. The second group of commands deals with com- 
plete scan-line configurations. Commands are provided for specifying 
that the bit configuration for the next N scan lines is invariant, con- 
tains all one bits, or contains all zero bits. Commands in the final 
group are used to pass various parameter values to the control com- 
puter and are not of interest here. 
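
The first two command groups can be illustrated with a small Python sketch. The command names UPDATE, REPEAT, ALL_ZEROS and ALL_ONES and their tuple layout are invented for this example; the actual command formats are those of Ref. 3 and are not reproduced here. We also assume the display buffer starts as all zeros.

```python
def changed_runs(prev, cur):
    """Return (start, length, value) for each maximal run of bits that
    differ from the previous scan line and share one new value."""
    runs, i, n = [], 0, len(cur)
    while i < n:
        if cur[i] != prev[i]:
            j = i
            while j < n and cur[j] != prev[j] and cur[j] == cur[i]:
                j += 1
            runs.append((i, j - i, cur[i]))
            i = j
        else:
            i += 1
    return runs

def encode(lines):
    """Encode successive scan lines as changes from the line before."""
    prev = [0] * len(lines[0])
    cmds = []
    for cur in lines:
        if not any(cur):
            kind = "ALL_ZEROS"
        elif all(cur):
            kind = "ALL_ONES"
        elif cur == prev:
            kind = "REPEAT"
        else:
            kind = None
        if kind is None:
            # Incremental group: only the bits that change are named.
            cmds.append(("UPDATE", changed_runs(prev, cur)))
        elif cmds and cmds[-1][0] == kind:
            # Complete-line group: extend the count N of the last command.
            cmds[-1] = (kind, cmds[-1][1] + 1)
        else:
            cmds.append((kind, 1))
        prev = cur
    return cmds

def decode(cmds, width):
    """Control-computer side: rebuild every scan line from the commands."""
    buf, lines = [0] * width, []
    for kind, arg in cmds:
        if kind == "UPDATE":
            for start, length, value in arg:
                buf[start:start + length] = [value] * length
            lines.append(list(buf))
        else:
            if kind == "ALL_ZEROS":
                buf = [0] * width
            elif kind == "ALL_ONES":
                buf = [1] * width
            lines.extend(list(buf) for _ in range(arg))
    return lines

# The single vertical bar discussed above, on a tiny 8-bit, 7-line raster.
bar = [0, 0, 1, 1, 1, 0, 0, 0]
raster = [[0] * 8] * 2 + [bar] * 3 + [[0] * 8] * 2
commands = encode(raster)
```

For this raster the seven scan lines reduce to four commands, and decode(commands, 8) reproduces the raster exactly. The saving grows with the number of scan lines over which the bar persists.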


2.2 Postprocessor Algorithms 

We turn now to the functioning of the postprocessor. The input 
data resides on the XYMASK output file in the form of various
parameter values and the coordinate specifications for the individual 
path and polygon geometric features in the mask or masks to be 
generated. The output is written on magnetic tape and consists of 
appropriate sequences of the commands discussed above. The neces- 
sary data processing can be iteratively characterized as follows: 
given the set of geometric figures intersected by the previous scan 
line, determine the set of figures intersected by the current scan line 
and compare the respective display buffer configurations; the result 
of this comparison is expressed in the encoding scheme and written 
onto tape. Iteration commences with a null set of figures in scan-line 
zero, and terminates when scan-line 32,000 has been processed. 

The practical aspects of the above characterization belie its sim- 
plicity of statement. A single mask may contain several thousand 
individual geometric features. Furthermore, the features occur on 
the XYMASK output file in random order with regard to geometric
position in the mask. Finally, it is important to accelerate the scan- 
line comparison process by quickly detecting sequences of scan lines 
which have the same display buffer bit configuration. The following 
paragraphs give a description of the methods and algorithms which 
were used.

The coordinates of the mask features on the output file represent
final device dimensions measured in micrometers from an arbitrary 
datum point. These coordinates must be scaled up by the appropriate 
factor to compensate for photographic reductions of the primary pat- 
tern, and converted to address units. A coordinate translation is then 
made to center the mask on the primary pattern plate. The postprocessor
is capable, at the user’s option, of generating either normal-tone
masks having opaque features on a clear background, or reverse-tone
masks displaying clear features on an opaque background. It
is an interesting and perhaps unique characteristic of the system 
that the two tonalities are produced with equal ease and facility. For 
simplicity, we will consider only normal-tone processing. 
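
As a rough illustration of the scaling and centering step, the sketch below converts device coordinates in micrometers to plate address units. The function name, the reduction factor of 35 and the address-unit size of 7 micrometers are assumptions chosen for round numbers, not values taken from the system; the centering rule (bounding-box center of the mask mapped to the plate center) is likewise one plausible reading of the text.

```python
def to_address_units(points_um, reduction, unit_um, plate=(26000, 32000)):
    """Scale device coordinates up by the total photographic reduction,
    convert to address units, and center the mask on the plate.

    points_um must hold every coordinate of the mask, since the
    centering translation is computed from the whole mask.
    """
    scaled = [(x * reduction / unit_um, y * reduction / unit_um)
              for x, y in points_um]
    xs = [p[0] for p in scaled]
    ys = [p[1] for p in scaled]
    # Translate the center of the mask's bounding box to the plate center.
    dx = plate[0] / 2 - (min(xs) + max(xs)) / 2
    dy = plate[1] / 2 - (min(ys) + max(ys)) / 2
    return [(round(x + dx), round(y + dy)) for x, y in scaled]

# A 100 by 50 micrometer mask, 35x reduction, 7 micrometer address unit.
corners = to_address_units([(0, 0), (100, 50)], reduction=35, unit_um=7)
```

With these assumed values the mask occupies 500 by 250 address units around the plate center at (13000, 16000).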

Given the set of individual mask features as input and considering 
the raster-scan process by which the artwork is created, it is clear 
that we are primarily interested in the feature boundaries. Returning 
to the simple mask discussed above, the writing beam is switched on 
at the left boundary of the bar, remains on in the interior, and is 
switched off at the right edge. Thus for our purpose the rectangle is 
totally characterized by its left and right boundary lines together 
with their respective tonality shifts. More generally, each polygon 
feature in the mask can be similarly characterized by listing all of its 
boundary line segments not parallel to the scan-line direction, to- 
gether with the appropriate tonality transitions. Any arcs which occur 
are approximated by a sequence of chords and are thus reduced to 
sets of line segments. Since path features are described on the XYMASK 
output file by centerline coordinates and width, some additional 
computation is required. Any arcs in the centerline are first approximated
by chords, and the path outline is then obtained by translating the
centerline line segments normally through distances of plus and minus
one-half the path width. The path then becomes a polygon and is
treated as above. 
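
The two geometric reductions just described, arcs to chords and paths to polygons, might be sketched as follows. This is an assumption-laden illustration: the function names, the chord-error tolerance parameter tol, and the mitered treatment of corners (adjacent offset segments extended to their intersection) are our own choices, since the text does not say how the postprocessor selects the number of chords or joins the translated segments.

```python
import math

def arc_to_chords(cx, cy, r, a0, a1, tol):
    """Approximate a circular arc (angles in radians) by chords whose
    maximum deviation from the arc does not exceed tol (tol > 0)."""
    sweep = a1 - a0
    # A chord subtending angle t deviates from the arc by r*(1 - cos(t/2)).
    step = 2 * math.acos(max(-1.0, 1 - tol / r))
    n = max(1, math.ceil(abs(sweep) / step))
    return [(cx + r * math.cos(a0 + sweep * k / n),
             cy + r * math.sin(a0 + sweep * k / n)) for k in range(n + 1)]

def path_outline(centerline, width):
    """Turn a path (centerline vertices plus width) into a polygon by
    translating each segment normally through plus and minus width/2.
    Assumes no zero-length segments and no 180-degree reversals."""
    h = width / 2
    normals = []
    for (x0, y0), (x1, y1) in zip(centerline, centerline[1:]):
        length = math.hypot(x1 - x0, y1 - y0)
        normals.append((-(y1 - y0) / length, (x1 - x0) / length))
    left, right = [], []
    for i, (x, y) in enumerate(centerline):
        n1 = normals[max(i - 1, 0)]
        n2 = normals[min(i, len(normals) - 1)]
        # Offset of the point where the two translated segments meet.
        k = 1 + n1[0] * n2[0] + n1[1] * n2[1]
        ox, oy = h * (n1[0] + n2[0]) / k, h * (n1[1] + n2[1]) / k
        left.append((x + ox, y + oy))
        right.append((x - ox, y - oy))
    return left + right[::-1]

# An L-shaped path of width 2 and a quarter circle of radius 100.
outline = path_outline([(0, 0), (10, 0), (10, 10)], 2)
chords = arc_to_chords(0, 0, 100, 0, math.pi / 2, tol=1)
```

The outline is returned as one closed tour of the periphery, so it can be handed on as an ordinary polygon. An arc in a centerline would first be replaced by the vertices from arc_to_chords.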


2.3 Postprocessor Structure 

A simplified diagram of the postprocessor is shown in Fig. 2. Each 
mask requires a complete pass through the system. The line segment 
decomposition routines read the mask-feature descriptions from the 
XYMASK output file, convert the coordinates into address units, de- 
compose each feature as described above, and write the resulting 
line segments with their tonality shifts onto the line-segment file. 
The set of line segments is next sorted into an order convenient for 
further processing. Each line segment is described by the two coordinate
pairs of its endpoints. The endpoint which has the lower value
for its Y coordinate is termed the lower endpoint. The sort is carried
out using the lower endpoint Y value as the primary key, and
the lower endpoint X value as secondary key. At the conclusion of
the sort, the sorted-line-segment file contains the line segments in
the order in which they are encountered by the raster scan. Line
segments first encountered by scan N precede those first encountered
by scan N + 1, and if several line segments are first encountered
by scan N, they occur on the file in the order of increasing scan
position.

Fig. 2—PPG postprocessor system.
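
The sort on the lower endpoint can be expressed compactly. In this sketch a line segment is a hypothetical tuple (endpoint, endpoint, shift), where shift is +1 for a boundary that turns the beam on and -1 for one that turns it off; the real file layout is not given in the text, and the sort here is done in memory rather than on a file.

```python
def lower_first(segment):
    """Order a segment's endpoints so the lower endpoint (smaller Y)
    comes first. Segments parallel to the scan line never reach this
    stage, so the two Y values always differ."""
    (xa, ya), (xb, yb), shift = segment
    if ya < yb:
        return ((xa, ya), (xb, yb), shift)
    return ((xb, yb), (xa, ya), shift)

def sort_segments(segments):
    """Primary key: lower-endpoint Y. Secondary key: lower-endpoint X."""
    ordered = [lower_first(s) for s in segments]
    return sorted(ordered, key=lambda s: (s[0][1], s[0][0]))

unsorted = [((5, 1), (5, 4), -1), ((2, 4), (2, 1), +1), ((0, 0), (3, 3), +1)]
in_scan_order = sort_segments(unsorted)
```

After the sort, a single forward pass over the list delivers each segment at the scan line on which it is first needed.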

The final section of the postprocessor reads the sorted-line-segment 
file, determines configuration changes between scan lines, and writes 
the appropriate commands on the PPG tape. This operation is car- 
ried out using a 26,000-bit image of the display buffer containing the 
bit configuration of the previous scan line and a linked list of 
all line segments contributing to the current scan line. The line- 
segment representation is compared to the bit-image configuration; 
any differences are appropriately encoded and written on the tape, 
and the relevant bits are changed in the bit image. When the com- 
parison has been completed, the list of relevant line segments is 
updated by deleting those which do not intersect the next scan line 
and interpolating any new ones which do from the sorted-line-seg- 
ment file. The scan routines are fairly simple but involve significant 
computer time. The postprocessor minimizes the number of scan 
comparisons by examining the line-segment list looking for scan lines 
which are identical to the previous one, or contain all-zeros or all- 
ones configurations. When such configurations are found, the scan 
comparison is bypassed, and the appropriate commands are written 
on tape. This capability allows very rapid processing for masks con- 
taining features having no slant-line boundaries. 
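
The scan loop described above can be sketched as follows, using an ordinary Python list in place of the linked list. The segment tuples ((x_low, y_low), (x_high, y_high), shift) and the name scan_rows are assumptions, as are the conventions that a segment covers scan lines from its lower endpoint up to but not including its upper endpoint, and that a bit is on wherever the running sum of tonality shifts is positive. The sketch produces the bit configuration of each scan line; the comparison with the previous line and the shortcut for unchanged lines are omitted.

```python
def scan_rows(sorted_segments, width, height):
    """Build the bit configuration of every scan line from boundary
    segments already sorted by lower endpoint (Y, then X)."""
    rows, active, nxt = [], [], 0
    for y in range(height):
        # Bring in segments first encountered by this scan line.
        while nxt < len(sorted_segments) and sorted_segments[nxt][0][1] <= y:
            active.append(sorted_segments[nxt])
            nxt += 1
        # Delete segments which no longer intersect the scan line.
        active = [s for s in active if s[1][1] > y]
        delta = [0] * (width + 1)
        for (x0, y0), (x1, y1), shift in active:
            # Address at which this boundary crosses the scan line.
            x = x0 + (x1 - x0) * (y - y0) / (y1 - y0)
            delta[min(max(round(x), 0), width)] += shift
        row, depth = [], 0
        for i in range(width):
            depth += delta[i]
            row.append(1 if depth > 0 else 0)
        rows.append(row)
    return rows

# A vertical bar, and a feature with one slant-line boundary.
bar_rows = scan_rows([((2, 1), (2, 4), 1), ((5, 1), (5, 4), -1)], 8, 5)
wedge_rows = scan_rows([((0, 0), (4, 4), 1), ((6, 0), (6, 4), -1)], 8, 4)
```

For the bar, rows 1 through 3 are identical, which is the situation the postprocessor detects from the line-segment list alone in order to bypass the scan comparison. For the wedge, every row differs from the one before, so each must be compared in full.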

The postprocessor execution time varies considerably with the com- 
plexity of the mask being generated. A typical interconnection mask 
ordinarily requires several minutes on an IBM 360/65 and writes
something on the order of one-quarter-million bits on the output 
tape. 


III. EBM POSTPROCESSING AND ALGORITHMS 


This section describes a system of programs which interfaces the
EBM pattern generator with XYMASK. This system consists of a postprocessor
within the XYMASK system and a program for the pattern
generator controller. The following short description of the EBM
pattern generator will give an insight into the data transformations
performed in both the XYMASK postprocessor and the control computer
program.


3.1 The EBM Pattern Generator


The EBM is similar to a cathode ray tube; in both, a beam of 
electrons is focused and deflected to form a spot on a target. One 
difference is that in the EBM, the target is a high-resolution photo- 
graphic plate, whereas in a cathode ray tube it is a phosphor screen. 
As the electron beam hits the target, the electrons directly expose 
the photographic emulsion and thereby produce a fine spot. A detailed
description of the EBM pattern generator is given in this
issue by W. R. Samaroo, et al.⁴

The EBM pattern generator includes a digital-control computer 
that drives, through appropriate interface equipment, a set of electro- 
static beam deflection plates located within the EBM. The electron 
beam position on the target is controlled to fill mask features by draw- 
ing a sequence of adjacent line segments parallel to one coordinate 
axis. Fill-line data in the form of position and length are transmitted 
from the control computer to the interface where the digital fill-line 
data are converted to a sequence of analog voltages that are applied 
to the deflection plates.

Since a typical mask pattern may contain an estimated 10? fill-lines, 
it is impractical to read or even to store this data in the control com- 
puter. To make data processing more practical, the following strategy 
is used for the EBM pattern generator: While the interface controls 
the drawing of one fill-line segment, the control computer calculates 
the position and length of the adjacent line segment. 

Input data to the control computer consists either of paths or of 
pairs of left-hand and right-hand boundaries specified by the end- 
points of straight-line segments or the endpoints and centers of
circular arcs. With this pattern-coding scheme, approximately 4000
words are required to represent a typical mask pattern of 10° fill- 
lines. This small volume of input data facilitates data transfer from 
the XYMASK postprocessor to the control computer. It is the task of
the postprocessor to read the XYMASK output file, transform the data
to right-hand and left-hand boundaries for the EBM, and to output 
this data. 


3.2 The EBM Postprocessor 

The EBM postprocessor is written in the *1 language® (read as
star one) and in FORTRAN IV for the IBM 360/50 computer. *1 is
used because of its inherent power in processing list-data structures
and FORTRAN IV is used for input, output, and some of the more
complex calculations.

The way the XYMASK output file describes the features of a mask
does not conveniently distinguish for the EBM the areas inside and 
outside the periphery of each feature. Generally speaking, the more 
automatic the drawing device, the more work has to be done by a 
computer to obtain this information. Devices such as the coordinato- 
graph and the Calcomp plotter, for example, require data in a form very 
similar to that of the XYMASK output file because these devices cut
or draw along the periphery of each feature. Since the Calcomp plots 
are part of the “debug” steps and are used for alignment and correc- 
tion, no further processing is required. In the case of the coordi- 
natograph, an operator must further process the plots by deciding 
which sections of the rubylith are to remain as part of the mask and 
which are to be removed and then he manually removes the unwanted 
pieces. This step in mask making is computerized for the EBM. 

The EBM postprocessor must interpret the XYMASK output file to
determine which points are inside or outside the periphery of each
feature. The EBM postprocessor converts the XYMASK output file
data into sets of left-hand and right-hand boundaries whose minimum 
and maximum Y coordinates, when connected, are parallel to the X 
axis. The more nonconvex the feature, the more difficult the task 
becomes. 

Since the EBM is a random-access plotter, the postprocessor
processes one path or polygon at a time before proceeding to the next
feature on the XYMASK output file. The data for a polygon are stored
as a linked list in the *1 program. The program determines the lower
left-hand and upper right-hand points by comparing the coordinates
contained in the list. From this, two routes along the periphery are
established, which eventually yield sets of left-hand and right-hand
boundaries. The actual structure of the list-processing algorithm is too
complex to be described here in detail.
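
Since the list-processing algorithm is not given, the following is only a minimal sketch of the idea of splitting a periphery into two routes. It assumes a simple polygon whose vertices are supplied in order; the name `split_boundaries`, the use of a Python list in place of a *1 linked list, and the orientation test are illustrative assumptions, not the postprocessor's actual code.

```python
def split_boundaries(vertices):
    """Split a polygon's periphery into left-hand and right-hand routes.

    vertices: list of (x, y) integer pairs in periphery order.
    Returns (left, right); both routes run from the lower left-hand
    point to the upper right-hand point.
    """
    n = len(vertices)
    # Lower left-hand point: smallest Y, ties broken by smallest X.
    lo = min(range(n), key=lambda i: (vertices[i][1], vertices[i][0]))
    # Upper right-hand point: largest Y, ties broken by largest X.
    hi = max(range(n), key=lambda i: (vertices[i][1], vertices[i][0]))

    # Two routes around the periphery, one in each direction of the list.
    forward = [vertices[(lo + k) % n] for k in range((hi - lo) % n + 1)]
    backward = [vertices[(lo - k) % n] for k in range((lo - hi) % n + 1)]

    # Twice the signed area (shoelace sum): positive means the vertex
    # order is counterclockwise, so the forward route is the right side.
    area2 = sum(x0 * y1 - x1 * y0
                for (x0, y0), (x1, y1) in zip(vertices, vertices[1:] + vertices[:1]))
    return (backward, forward) if area2 > 0 else (forward, backward)
```

For the rectangle `[(0, 0), (4, 0), (4, 3), (0, 3)]` the left-hand route runs up the left edge and across the top, and the right-hand route runs across the bottom and up the right edge. Strongly nonconvex features would need further subdivision, which this sketch does not attempt.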

One of the unique features of the EBM postprocessor is the interpre- 
tation of paths. As mentioned above, a path is described on the 
XYMASK output file as a centerline and a path width normal to the 
centerline. Postprocessors for drawing devices such as the coordinato- 
graph must translate this path information into a polygon before the 
feature can be plotted. In other words, the postprocessor must find 
the periphery points for the path. The EBM postprocessor takes ad- 
vantage of the form of the output file data by treating the path as 
the figure formed when a circular tool, having the path width as the 
diameter, is moved along the centerline. Rather than converting the 
path into polygon data and then processing the resulting polygon, the 
postprocessor passes the major portion of path processing onto the 
EBM’s control computer. The description of the control computer 
algorithms, which follows, will explain how this data is handled. 


3.3 EBM Control Computer Algorithm 

The control computer is capable of calculating the boundary and 
outline points in less time than it takes for the EBM to draw fill- 
lines. Thereby, the interface and EBM become the limiting factors 
in allowing the pattern generator to maintain an average pattern draw- 
ing time of one microsecond per addressable point for a significant set 
of masks. The calculations of the endpoints of fill-lines along the 
left-hand and right-hand boundaries are based on integer arithmetic.® 
The following example of straight-line-to-arc boundaries illustrates
the use of integer arithmetic in this application. 

Consider a set of boundaries consisting of the straight line
$Y = (A/B)X$ and the circular arc $X^2 + Y^2 = R^2$. The constants
$A$, $B$, and $R^2$ are integers calculated from the control computer input
data.

In integer arithmetic, the straight line is redefined as:

$$F = BY - AX \tag{1}$$

where F represents a third dimension. Thus, the straight line can be
considered as the intersection of two planes in F space, with equation
(1) defining one plane, and the XY plane the other.

The introduction of the dimension F results in the following useful
properties:

(i) F is zero on the straight line and has opposite signs for points
X, Y on opposite sides of the straight line.
(ii) There exists a single value of F for each point in the XY plane.
(iii) F is an integer for all integer points X, Y.
(iv) There is no error in a sequence of integer solutions for F.
(v) The smallest integer number is 1. If this is the smallest addressable
unit in the graphic field, then all points X, Y that are within 1 unit
of the true solution represent the true solution in the XY plane.


These properties of F make it easy to form an algorithm for cal- 
culating integer points along a straight line. If equation (1) is evalu- 
ated at the point (0,0), then the resultant F is 0. Rather than evaluate 
equation (1) for F at all points, it is easier to calculate a change in 
F between adjacent integer points. The adjacent integer points (1,0), 
(0,1), and (1,1), (in the neighborhood of the straight line) have the 
integer F values of —A, +B and B — A respectively. According to 
property (i) the point (0,0) is on the straight line and the points
(1,0) and (0,1) are on opposite sides of the line. According to prop- 
erties (i) and (iv), a step-by-step calculation of F values from the 
point (0,0) to (1,1) will result in the identical F value at the (1,1) 
point regardless of the steps taken en route. Choosing a sequence of 
points with the smallest F values guarantees that the points are as 
close to the straight lines as the address structure of the field allows. 

According to property (v), there may exist several integer values of 
X and Y that represent the true solution point of the straight line. 
This observation is used to form a more practical algorithm where only 
one addition and one test for sign of F per point is required to find 
the next integer point along the straight line. 
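
The stepping rule is not spelled out in the text, so the following is a minimal sketch of one rule that meets the stated budget of one addition and one sign test per point. It assumes positive integers A and B and a start at the origin; the name `line_points` and the choice to step in X when F is zero are assumptions made for illustration.

```python
def line_points(a, b, steps):
    """Integer points that track the line Y = (a/b) X, with a, b > 0.

    F = b*Y - a*X is carried along incrementally: a unit step in X
    changes F by -a, a unit step in Y changes F by +b.
    """
    x = y = f = 0              # F is 0 at the origin, which lies on the line
    points = [(x, y)]
    for _ in range(steps):
        if f < 0:              # point is below the line: step up in Y
            y += 1
            f += b
        else:                  # point is on or above the line: step in X
            x += 1
            f -= a
        points.append((x, y))
    return points
```

With this rule F always stays in the range -A <= F < B, so every generated point remains close to the line, and for A = 3, B = 5 the eighth step lands exactly on the line at (5, 3).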

A circular arc is the other boundary considered in the example. The
circular arc is redefined in integer arithmetic as

$$F = X^2 + Y^2 - R^2 \tag{2}$$

where again, F represents an added dimension. The circular arc is
thus formed in F space by the intersection of the XY plane with a
paraboloid. The properties (i) through (v) also hold true for
equation (2).

A sequence of integer points along a circle is computed by taking 
unit increments parallel to either the X or Y axis and computing the 
resultant F values; for a change of 1 addressable unit in the X direc- 
tion, F changes by 2X + 1. The corresponding computation is shift 
left, increment, and add. A test of sign of F determines whether the
next step increments X again or decrements Y. The coordinates thus
generated are located along the circular arc and form the mask-feature
boundary points.
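
The circular-arc stepping just described can be sketched as follows for the first quadrant, starting at (0, R) and ending at (R, 0). This is an illustration under assumptions: the name `arc_points`, the starting point, and the convention of stepping in X whenever F is zero or negative are not taken from the paper.

```python
def arc_points(r):
    """Integer points along the first-quadrant arc of X^2 + Y^2 = r^2.

    F = X^2 + Y^2 - r^2 is updated incrementally, so no multiplication
    or square root is needed inside the loop.
    """
    x, y, f = 0, r, 0              # (0, r) lies on the circle, so F = 0
    points = [(x, y)]
    while y > 0:
        if f <= 0:                 # on or inside the circle: increment X
            f += (x << 1) + 1      # shift left, increment, and add: 2X + 1
            x += 1
        else:                      # outside the circle: decrement Y
            f += 1 - (y << 1)      # change in F for Y -> Y - 1 is 1 - 2Y
            y -= 1
        points.append((x, y))
    return points
```

Every point produced this way lies within one address unit of the true circle, which is the sense in which property (v) lets an integer point stand for the true solution.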

It is also possible to construct an integer arithmetic algorithm 
to compute points along the outline of a path. According to the path 
definition, points on each side of the outline represent the envelope 
generated by a circle moving along the centerline as illustrated in 
Fig. 3. 

The path algorithm finds points along the outline by choosing points 
along the circle until the normal to the circle is aligned with the 
normal to the path centerline. The circle is then displaced along the 
path centerline and the above process is repeated. A separate but 
identical algorithm is used for finding points on the opposite outline. 
Fill-lines are drawn parallel to one coordinate axis between these 
points. While the above algorithm appears to be complicated, sur- 
prisingly few calculations are required to find the endpoints of the 
fill-lines. For example, the slope of a curve in the XY plane is given 
by the ratio of change of F for changes in the X and Y directions, 
where the change of F in both directions is already available from 
the straight-line and circular-arc algorithms. The normal to a curve
is the negative inverse of the slope, and thus the only additional com- 
putation required in the path algorithm is the comparison of a 
sequence of two integer ratios. 
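
Comparing two integer ratios does not require a division; cross-multiplication keeps the test in integer arithmetic. The sketch below is an assumption about how such a comparison could be coded (the name `compare_ratios` is hypothetical), followed by a small use on a circle of radius 5.

```python
def compare_ratios(p1, q1, p2, q2):
    """Return -1, 0, or 1 as p1/q1 is less than, equal to, or greater
    than p2/q2. Both denominators are assumed to be positive, so the
    sign of p1*q2 - p2*q1 decides the comparison without dividing."""
    cross = p1 * q2 - p2 * q1
    return (cross > 0) - (cross < 0)


# Walk integer points near a circle of radius 5 until the ratio X/Y of
# the radius vector (the direction of the normal to the circle) reaches
# a target ratio A/B, here 3/4.
arc = [(0, 5), (1, 5), (1, 4), (2, 4), (3, 4), (4, 4)]
aligned = next(p for p in arc if compare_ratios(p[0], p[1], 3, 4) >= 0)
```

Here `aligned` is the first listed point at which the two ratios match or cross, which is the condition the path algorithm tests before displacing the circle along the centerline.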

As is evident from the above discussion, only a few instructions
are required in the control computer to calculate the endpoints of
a fill line between a set of boundaries. As a result of the redefinition
of the problem in integer arithmetic, the calculations in most instances
are completed before fill-line generation is finished, allowing the EBM
pattern generator to maintain the one microsecond per addressable
point drawing speed.

The postprocessor execution time varies with the complexity of the 
mask being generated but to a lesser degree than for the PPG post- 
processor. Several minutes on an IBM 360/65 are ordinarily required 
for a typical interconnection mask. 


IV. DISCUSSION 


Several computer systems used in the generation of integrated-circuit 
masks have been described in the preceding sections. The first sec- 
tions dealt with the XYMASK system which links the circuit designer to
the mask-fabrication process. XYMASK provides a computer-independent
language for describing the mask configurations, and pro-
duces either outline drawings or mask artwork on one or more of a 
number of different graphical output devices. The majority of all 
Bell Laboratories and Western Electric Company masks are produced 
using the XYMASK system. 

The next two sections described XYMASK subsystems which generate
artwork on the PPG and EBM. These two plotters fundamentally 
differ in that the first uses a raster-scan technique, while the second 
is a random-access device. Each is supported by a dedicated control 
computer. The subsystem descriptions indicate a degree of similarity 
in postprocessor functions, but different approaches toward the divi- 
sion of the necessary computation between the postprocessor and the 
control computer.


Device Photolithography:


The Primary Pattern Generator 


Introduction 


By K. M. POOLE 
(Manuscript received July 10, 1970) 


The need for a new, high-speed pattern generator capable of pro- 
ducing the more complex and precise circuit patterns required in the 
1970s has already been discussed.t This paper describes the design 
and operation of the Primary Pattern Generator (PPG) in some 
detail. For the convenience of the reader, the paper has been separated 
into four parts. Part I covers the optical design of the machine, in- 
cluding the considerations which led to the choice of an argon laser 
light source, a recording emulsion, and an optimum combination of 
spot size and brightness. The original choice of a mechanically scanned 
system was made on the premise that, with such an approach, the 
required accuracy could be built in and retained over many years 
of operation, and Part II discusses the principal considerations behind 
this premise. In that paper are discussed the dimensional stability of 
the structural materials and their use in an extremely stiff structure, 
the features provided to align the parts of the system to the required 
tolerances, and the design of drive systems, essentially free from both 
vibration and wear. The control of the machine to produce the pat- 
tern encoded on the input tape is discussed in Part III; Part IV deals 
with the methods used to align the assembled machine and details the 
pattern accuracy and reproducibility which was achieved. 

The PPG, a highly automated system requiring operator action at 
very few points in the cycle, is part of an overall system running 
under computer scheduling. Operator acceptance of the system has 
been excellent, perhaps due to the incorporation of status displays 
beyond the essential minimum including a real-time display of the 
pattern being produced.


Device Photolithography:


The Primary Pattern Generator 


Part I-Optical Design 


By M. J. COWAN, D. R. HERRIOTT, A. M. JOHNSON and 
A. ZACHARIAS 


(Manuscript received July 10, 1970) 


I. INTRODUCTION 


The basic design concept of the primary pattern generator (PPG) 
is the production of a linearly scanning, small, constant-size light spot. 
The scanning system consists of a regular polygonal-prism mirror 
which rotates about its axis of highest symmetry. The mirror faces are 
used sequentially to reflect a collimated light beam into a lens (for 
example, the scanning lens of Fig. 1). The collimated light is focused 
to a spot which scans a line in the focal plane of the lens as the 
polygonal mirror rotates. Located in the focal plane of the lens is a 
flat, glass photographic plate. The glass plate is moved by the desired 
scan line separation during the time required to bring the succeeding 
mirror facet into proper position. 

The collimated beam incident onto the rotating mirror is formed 
by the scanning lens from a diverging beam obtained from a laser. 
The location of the reflecting mirror facet must be close to the 
aperture plane of the scanning lens in order to insure that the mode 
is not truncated by the physical lens apertures after the light is re- 
flected from the mirror facet. Translation of the reflecting facet will 
not affect the position of the focused spot; the spot position is uniquely 
determined by the directions of the incident collimated beam and of 
the reflecting mirror facet relative to the optic axis of the lens. A 
barrel distortion is designed into the scanning lens such that the 
linear velocity of the focused spot is proportional to the angular 
velocity of the rotating mirror. 

The machine just described is basically analog along its fast-scan
axis, although it is digital along the slow (substrate translation) axis.
Since the required reproducibility is greater than the required accuracy,
a digitally operating machine is more desirable than an analog
machine. The fast-axis can be made digital by using a separate beam
to scan over a grating type of code plate. The location of this beam on
the code plate tracks the position of the writing beam and generates
timing pulses for a control computer. The resolution of the code
plate must be as good as the reproducibility required; that is, the
code plate system must be capable of resolving 26,000 positions per
scan length.

Fig. 1—Schematic of primary pattern generator.

The pattern size is principally established by the capabilities of the
scanning lens. The minimum spot diameter is determined by the
approximate diffraction limitation of equation (1),¹ obtained when a
lens aperture is uniformly illuminated.

$$I(r) = I_0 \left[ \frac{2 J_1(x)}{x} \right]^2, \qquad x = \frac{\pi r}{\lambda f_n} \tag{1}$$

Here, $f_n$ is the f-number of the lens forming the image $I(r)$; $r$ is the
radial distance from the image center; $I_0$ is a constant proportional
to the intensity illuminating the aperture; $\lambda$ is the wavelength; and
$J_1$ is the first-order Bessel function.
Using this relation, we approximated the half-power diameter of a
spot formed by such an illuminated lens to be

$$D \approx 0.58 f_n, \quad D \text{ in } \mu\text{m}, \quad \lambda = 520\ \text{nm}. \tag{2}$$


We now consider that the polygonal mirror will have some wobble 
to its motion, and further, that all faces of the mirror will not be 
exactly parallel to the rotation axis. Consequently, to reduce the 
effect of these mirror defects on the pattern, the scanning lens should 
operate with as large a field angle as possible. This wide-angle 
requirement limits the f-number for which diffraction limited per- 
formance can be obtained in a lens. For a 48° field angle, calculations 
made by Tropel, Inc.,* showed that a minimum f-number of 13 could 
be used for good performance of the coding beam over the field. 
Using equation (2), a spot size of 7.5 µm half-power width is thus
obtained; this will be approximately the size of the address unit.
Since 26,000 address units are required for a full scan line, an
address size of 7.0 µm will allow the full pattern of 26,000 by 32,000
address units to fit on a standard 8” x 10” photographic plate.
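
The arithmetic behind these figures can be checked with a short sketch. It takes the coefficient 0.58 µm per unit f-number from equation (2) and the 7.0-µm address size from the text; the function names and the inch-to-millimeter conversion are added for illustration.

```python
INCH_MM = 25.4  # millimeters per inch


def half_power_spot_um(f_number, coeff_um=0.58):
    """Half-power spot diameter in micrometers, D = 0.58 * f-number."""
    return coeff_um * f_number


def pattern_size_mm(n_scan, n_lines, address_um):
    """Pattern dimensions in millimeters for a given address size."""
    return n_scan * address_um / 1000.0, n_lines * address_um / 1000.0


spot_um = half_power_spot_um(13)                        # f/13 scanning lens
width_mm, length_mm = pattern_size_mm(26_000, 32_000, 7.0)
plate_mm = (8 * INCH_MM, 10 * INCH_MM)                  # 8 in. by 10 in. plate
fits = width_mm <= plate_mm[0] and length_mm <= plate_mm[1]
```

The result is a spot of about 7.5 µm and a pattern of 182 mm by 224 mm, which fits within the 203 mm by 254 mm plate.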

To produce a complete pattern in less than 10 minutes, each of the 
32,000 scan lines must be traversed in less than 20 ms. Since the 
writing-beam diameter will be less than twice the address spacing, 
the beam must sweep its own diameter in less than 800 ns. To 
produce sufficient exposure on high-resolution emulsion² requires a
beam brightness obtainable only from a laser. However, the
writing-beam power required is only 20 µW. Orthochromatic emulsion is
desirable since it will allow a safelight environment. Thus an argon
laser,³ operating at 5145 Å wavelength, was chosen as the light source.
It operates in the lowest transverse mode;⁴ thus the radial intensity
distribution anywhere in the beam path is gaussian. The output of
the laser is stabilized by feedback through the laser power supply to
a variation of less than 1 percent, thus insuring uniform exposure of
the photographic plate.


* Located at 52 West Avenue, Fairport, New York.


II. THE PHOTOGRAPHIC EMULSION AND THE EXPOSURE PROCESS 


The sweep of the writing beam across the photographic plate
results in a variation of the exposure of the emulsion in a direction
normal to the scanning direction. If we use the scanning velocity as
$v_0$ and the intensity distribution of the scanning spot as

$$I(r) = \frac{2P}{\pi w^2}\, e^{-2r^2/w^2} \tag{3}$$

where P is the total power in the writing beam and w is the waist
radius,⁵ then taking the scan to be x-directed along the line $y = y_0$,
the variation of exposure in the y-direction is obtained by integration,
as

$$E(y) = \frac{2P}{\pi w^2}\, e^{-2(y-y_0)^2/w^2} \int_{-\infty}^{\infty} e^{-2(v_0 t)^2/w^2}\, dt = \sqrt{\frac{2}{\pi}}\, \frac{P}{w v_0}\, e^{-2(y-y_0)^2/w^2}. \tag{4}$$

The next line will scan with $y_0$ changed by one address spacing and
the exposure produced by this scan will be added to the exposure of
the first scan. The total exposure produced by N scans is thus obtained
by summing N displaced gaussians given by equation (4).
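
The summation of displaced gaussians can be sketched numerically as below. Lengths are in address units and each scan is normalized to a peak exposure of 1; the function names and this normalization are assumptions for illustration. The conversion from half-power diameter D to waist radius w follows from setting the gaussian intensity to one half at r = D/2.

```python
import math


def waist_from_half_power(d_half):
    """Waist radius w of a gaussian spot with half-power diameter d_half.

    exp(-2 r^2 / w^2) = 1/2 at r = d_half / 2 gives d_half = w * sqrt(2 ln 2).
    """
    return d_half / math.sqrt(2.0 * math.log(2.0))


def summed_exposure(y, n_lines, w, spacing=1.0, peak=1.0):
    """Total exposure at height y from n_lines scans centered at
    y0 = 0, spacing, 2*spacing, ... each contributing a gaussian."""
    return sum(peak * math.exp(-2.0 * (y - k * spacing) ** 2 / w ** 2)
               for k in range(n_lines))


w = waist_from_half_power(1.3)             # spot of 1.3 address units
center = summed_exposure(10.0, 21, w)      # middle of a 21-line block
```

For a 1.3-address-unit spot the exposure in the middle of a large block comes out at roughly 1.4 times the peak of a single scan, because only the nearest neighbors overlap appreciably.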

A similar analysis is used to obtain the exposure resulting from
modulation of the writing spot. In this case, the beam is turned off at
x = 0 for each scan. As a first approximation, we assumed the
intensity of the writing spot to decrease with a relaxation time of
$\tau = d/v_0$ where d can be interpreted as a rise distance in analogy to a
rise time. The exposure caused by a single trace having the beam
turned off at x = 0 becomes

$$E(x, y) = \frac{2P}{\pi w^2}\, e^{-2(y-y_0)^2/w^2} \left[ \int_{-\infty}^{0} e^{-2(x - v_0 t)^2/w^2}\, dt + \int_{0}^{\infty} e^{-t/\tau}\, e^{-2(x - v_0 t)^2/w^2}\, dt \right] \tag{5}$$

which is evaluated in terms of the error function and its complement.⁶

$$E(x, y) = \frac{P\, e^{-2(y-y_0)^2/w^2}}{w v_0 \sqrt{2\pi}} \left[ \operatorname{erfc}\!\left( \frac{\sqrt{2}\, x}{w} \right) + e^{\,w^2/8d^2 - x/d} \left( \operatorname{erf}\!\left( \frac{\sqrt{2}\, x}{w} - \frac{w}{2\sqrt{2}\, d} \right) + 1 \right) \right] \tag{6}$$
Application of this exposure to a high-contrast emulsion will result 
in the production of a density gradient at the boundaries of the ex- 
posed regions. The greatest magnitude of the gradient will occur very 
close to the contour of 0.5 optical transmission through the developed 
image. The task of determining the actual image formed by the
exposure function of equation (6) is thus reduced to tracing the contour
of the exposure necessary to produce 0.5 transmission and evaluating
the exposure gradient normal to this contour. A computer program was
written to evaluate equation (6) over a matrix of points. Table I 
shows some of the results of these calculations. An exposure of 1.00 
is used to produce the 0.5 transmission value. 
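
A matrix evaluation of this kind can be sketched as follows. The closed form used here is obtained by carrying out the two integrals of the exposure model (gaussian spot of waist radius w, beam turned off at x = 0 with rise distance d); it should be read as a reconstruction under that assumption, and the function names, default values, and grid are hypothetical. A brute-force integration is included as a cross-check.

```python
import math


def edge_exposure(x, y, w, d, y0=0.0, p=1.0, v0=1.0):
    """Closed-form exposure near the turn-off edge at x = 0."""
    a = math.sqrt(2.0) * x / w
    decay = math.exp(w * w / (8.0 * d * d) - x / d)   # may overflow for very small d
    bracket = math.erfc(a) + decay * (math.erf(a - w / (2.0 * math.sqrt(2.0) * d)) + 1.0)
    gauss_y = math.exp(-2.0 * (y - y0) ** 2 / w ** 2)
    return p * gauss_y / (math.sqrt(2.0 * math.pi) * w * v0) * bracket


def edge_exposure_numeric(x, y, w, d, y0=0.0, p=1.0, v0=1.0, t_max=20.0, n=40000):
    """Midpoint-rule integration of the same model, for checking only."""
    h = t_max / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += math.exp(-2.0 * (x + v0 * t) ** 2 / w ** 2)               # beam on, t < 0
        total += math.exp(-v0 * t / d - 2.0 * (x - v0 * t) ** 2 / w ** 2)  # decaying, t > 0
    return 2.0 * p / (math.pi * w * w) * math.exp(-2.0 * (y - y0) ** 2 / w ** 2) * total * h


# Exposure over a small matrix of points (address units, arbitrary scale).
xs = [i * 0.5 for i in range(-4, 5)]
ys = [j * 0.5 for j in range(-2, 3)]
grid = [[edge_exposure(x, y, w=1.5, d=0.5) for x in xs] for y in ys]
```

Far to the left of the edge the bracket tends to 2 and the result reduces to the single-scan exposure of equation (4); far to the right it falls to zero.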

TABLE I—VARIATION OF EXPOSURE PARAMETERS

Half-Power Spot Diameter             2.7   2.0   1.7   1.3   2.0   1.7   1.3
Peak Exposure of a Single Scan       1.1   1.8   2.5   4.7   0.9   1.1   1.4
Width for 5-Scan Lines               6.0   6.0   6.0   6.0   5.0   5.0   5.0
Gradient (dE/dy)                     1.0   1.5   2.0   3.1   1.0   1.2   1.7
Peak Exposure for Large Number
  of Scan Lines                      2.9   3.7   4.5   6.7   2.0   2.0   2.0
Length for 5-Address Modulation      6.1   6.2   6.3   6.5   5.0   5.0   5.0
Gradient (dE/dx)                     0.9   1.3   1.6   2.1   0.8   0.9   1.0


For simplest operation, five scan lines or a five-address modulation
should produce an image five address units in dimension. To obtain
a best compromise between freedom from mirror facet wobble and
maximum edge gradient, we chose to operate with a half-power writing
beam diameter between 1.3 and 1.7 address units (9 to 12 µm).
Equation (4) can now be used to calculate the beam power required to
obtain proper exposure on various emulsions. For a spot velocity of
approximately 16 m/s, 20 µW of beam power will produce a maximum
exposure of about 120 ergs/cm². High resolution plate² requires over
1000 ergs/cm² for proper exposure. Eastman Kodak Company had an
emulsion which reached proper exposure between 20 and 100 ergs/cm²,
although it was not a standard product. This emulsion, called Minicard,
was available on special order; Eastman Kodak now produces 8” × 10”
glass plates coated with Minicard emulsion.
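
The quoted exposure can be reproduced from the peak of the single-scan gaussian, E = sqrt(2/pi) P / (w v0), which is what integrating a gaussian spot of power P and waist radius w along a scan at velocity v0 gives on the scan centerline. The sketch below assumes that formula and converts to CGS units; the function name is hypothetical.

```python
import math


def peak_exposure_erg_cm2(power_w, velocity_m_s, half_power_diam_um):
    """Peak exposure of one scan line in ergs per square centimeter."""
    # Waist radius from half-power diameter: D = w * sqrt(2 ln 2).
    w_cm = half_power_diam_um * 1e-4 / math.sqrt(2.0 * math.log(2.0))
    p_erg_s = power_w * 1e7          # 1 W = 1e7 erg/s
    v_cm_s = velocity_m_s * 100.0
    return math.sqrt(2.0 / math.pi) * p_erg_s / (w_cm * v_cm_s)


e_small_spot = peak_exposure_erg_cm2(20e-6, 16.0, 9.0)    # 9-um spot
e_large_spot = peak_exposure_erg_cm2(20e-6, 16.0, 12.0)   # 12-um spot
```

For 20 µW at 16 m/s this gives roughly 130 ergs/cm² for a 9-µm spot and roughly 98 ergs/cm² for a 12-µm spot, bracketing the figure of about 120 ergs/cm² and falling well short of the 1000 ergs/cm² needed by high resolution plate.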

The glass photographic plates must have a very flat emulsion sur- 
face. Fig. 2 is an illustration of the effect of plate camber. The 
emulsion surface will be held near the extremes of the scan line. 
However, plate camber will cause registration errors between plates 
because of the angular scan of the writing beam. The maximum angle 
made by the writing beam and the normal to the photographic plate 
is 15°. To produce less than a one-address-length error between X₁
and X₂ of Fig. 2, the plate camber must be less than ±28 µm. This
specification is safely met by Kodak microflat plates, but is very far 
from being met by the specifications of lower grades of glass plates. 


III. THE ROTATING MIRROR AND SCANNING LENS


The dimensions of the rotating polygonal mirror are determined by 
the scanning-lens aperture. Since the f-number, field size and field 
angle of the scanning lens have been determined by equations (1) and 
(2), the aperture size is also determined. The facet size of the polygonal 
mirror can be found by geometry, as well as the overall size of the 
polygon. Referring to Fig. 3, the radius of the polygon must be large 
enough to keep the vertices out of the lens aperture during the rota- 
tion producing the scan of a line. 

Since a gaussian illumination of the aperture is being used, the
full aperture diameter must be larger than that computed from
equation (2) for a uniformly illuminated lens. A best estimate of
satisfactory performance with gaussian illumination was f/10 and the
polygonal mirror was designed not to truncate this aperture during
the scan. The value of R for this condition is 9.7 cm. The location of
the aperture plane of the scanning lens must lie at the approximate
location of the mirror facet. To obtain a uniform scanning velocity
from constant angular velocity of the polygonal mirror, a θ/tan θ
distortion was part of the scanning-lens design; θ is the angle between
incident collimated light and the lens axis.

Fig. 2—The effect of photographic plate camber.

The number of facets on the polygonal mirror determines the ratio 
between the time available for writing and the unavailable time. Since 
the field angle of the written line is 45.4°, 22.7° of mirror rotation 
is spent writing a line. For the decagonal mirror used, 36° of mirror 
rotation is required to go from the start of one scan to the start of the 
next scan. Hence, 13.3° of rotation are unavailable. In order to write 
a complete pattern in 10 minutes, each scan line must be traversed in 
18.8 ms; 11.8 ms writing and 7.0 ms waiting for the next facet to 
come into position. It is during this wait that the photographic plate 
is advanced one address spacing (7 µm).
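
The timing budget in this paragraph follows from simple proportions, as the sketch below shows. The inputs (10 facets, 45.4° field angle, 32,000 lines, 10 minutes) are from the text; the function name and the returned dictionary are illustrative assumptions.

```python
def facet_timing(n_facets, field_angle_deg, n_lines, total_s):
    """Split the time per scan line into writing and waiting portions."""
    per_facet_deg = 360.0 / n_facets          # rotation from one scan start to the next
    write_deg = field_angle_deg / 2.0         # reflected beam turns twice the mirror angle
    line_s = total_s / n_lines                # time available per scan line
    write_s = line_s * write_deg / per_facet_deg
    return {
        "unavailable_deg": per_facet_deg - write_deg,
        "line_ms": line_s * 1e3,
        "write_ms": write_s * 1e3,
        "wait_ms": (line_s - write_s) * 1e3,
        "sweeps_per_s": 1.0 / line_s,
        "rpm": 60.0 / (n_facets * line_s),
    }


timing = facet_timing(n_facets=10, field_angle_deg=45.4, n_lines=32_000, total_s=600.0)
```

This yields about 18.75 ms per line, split into roughly 11.8 ms of writing and 6.9 ms of waiting, at about 53 sweeps per second.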


IV. THE OPTICAL MODULATOR 


The writing beam modulator used is an acoustooptic deflector.⁷ The
modulator operates by the interaction of the laser beam with a 50- 
MHz ultrasonic wave in a piece of fused silica. This device deflects 
approximately 2 percent of the power of the incident laser beam at an 
angle of 4 mrad to the incident beam when the modulator is ener- 
gized. Since the modulator is located in a near field region of the 
laser beam, the two beams emerge from the modulator each nearly 
collimated but having angular separation. These beams are then 
passed through a 10-cm focal length lens which transforms the 
angular divergence into a displacement sufficient for physical separa- 
tion. The separation is accomplished by a knife edged mirror which 
has better than a 40-dB discrimination between the beams. 

The 2 percent power in the deflected beam provides more than 17- 
dB on-off ratio and is limited by back reflections and scattering. 


2040 THE BELL SYSTEM TECHNICAL JOURNAL, NOVEMBER 1970 


However, this is sufficient for the writing-beam modulation. The un- 
deflected beam is used as the coding beam. The modulator has a rise 
time of less than 200 ns, including the transistor drivers. The transducer 
is X-cut crystal quartz. 


V. MODE-MATCHING OPTICS 


A series of lenses are required to transform the output mode of the 
laser to modes required for the modulator and then to the modes 
required by the scanning lens. The output of the laser is limited to
a TEM₀₀ mode by use of an aperture within the laser cavity. The
calculation of the positions and focal lengths of the required
transforming lenses was done using the method described by H. Kogelnik.⁸

The first transformation is between the laser output and the optical
modulator. The modulator requires a 300-µm waist radius in the
fused silica. In turn, this mode is transformed to a 55-µm waist
located at the knife-edged separation mirror. The writing beam is
transformed to approximately a 9-µm waist radius at the object focal plane
of the scanning lens and the proper writing spot is produced. The code
beam is transformed by a pair of lenses. The first produces a mode
having a waist radius of 800 µm, an essentially collimated beam for
the 50-cm distance to the code plate. The second lens is a cylindrical
lens which produces a 4-µm waist radius in one direction and does
not change the 800-µm waist radius in the perpendicular direction.
This slit-shaped spot is imaged by the scanning lens to a slit spot on
the code plate.
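
The statement that an 800-µm waist is essentially collimated over 50 cm can be checked with the standard gaussian-beam relations, z_R = pi w0^2 / lambda and w(z) = w0 sqrt(1 + (z/z_R)^2). The sketch assumes the 5145 Å argon line in free space; the function names are illustrative.

```python
import math

WAVELENGTH_M = 514.5e-9  # argon laser line, 5145 angstroms


def rayleigh_range(w0_m, wavelength_m=WAVELENGTH_M):
    """Distance over which a gaussian beam stays roughly collimated."""
    return math.pi * w0_m ** 2 / wavelength_m


def beam_radius(z_m, w0_m, wavelength_m=WAVELENGTH_M):
    """Beam radius at distance z_m from a waist of radius w0_m."""
    z_r = rayleigh_range(w0_m, wavelength_m)
    return w0_m * math.sqrt(1.0 + (z_m / z_r) ** 2)


z_r_wide = rayleigh_range(800e-6)          # 800-um waist
w_at_50cm = beam_radius(0.50, 800e-6)      # radius after 50 cm
z_r_narrow = rayleigh_range(4e-6)          # 4-um waist of the slit direction
```

The 800-µm waist has a Rayleigh range of about 3.9 m, so the radius grows by less than 1 percent over 50 cm, whereas the 4-µm waist diverges within about 0.1 mm.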


VI. THE CODE PLATE 


The code plate is a ruled grating having approximately 13,300
cycles. Each cycle consists of a 7-µm opaque region and a 7-µm
clear region. The slit shaped coding beam is focussed in its narrow
dimension to best resolve the grating. The long dimension of the beam 
is aligned to the ruling direction of the grating. In this manner, small 
defects in the grating, dust specks and pinholes do not significantly 
affect the code-plate system. 
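
A quick consistency check relates the grating to the address structure. It assumes that each opaque-to-clear or clear-to-opaque edge can mark one address position, which is an interpretation rather than something stated in this section; the variable names are illustrative.

```python
cycles = 13_300            # grating cycles on the code plate
opaque_um = 7              # width of each opaque region
clear_um = 7               # width of each clear region
addresses_needed = 26_000  # address units per scan line

period_um = opaque_um + clear_um
grating_length_mm = cycles * period_um / 1000.0
edges = 2 * cycles         # two edges per cycle (assumed one address each)
scan_length_mm = addresses_needed * 7 / 1000.0
```

Under that assumption the grating spans 186.2 mm and offers 26,600 edges, slightly more than the 182-mm, 26,000-address scan line requires.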

The coding beam will traverse the full-field angle over the scan of 
a line. In order to collect the coding beam onto a photodetector after 
it has passed through the code plate, a Fresnel lens is positioned be- 
yond the code plate (see Fig. 1). This lens images the aperture of the 
scanning lens onto the face of a photomultiplier tube. The sensitivity 
of this device is required so that the coding beam can be attenuated
by approximately 20 dB before it illuminates the scanning lens. If
this attenuation is not used, then the scatter from the intense coding 
beam fogs the photographic plate and reduces the modulation capable 
of being obtained with the writing beam alone. 

The processing and use of the code plate output is described in
Part III—The Control System. The alignment of the code plate for
production of an accurate scan is described in Part IV—Alignment
and Conclusions.


I. INTRODUCTION


The primary pattern generator (PPG) is an electromechanical 
light-scanning system with an unusual combination of speed and 
accuracy. A 10-µm-diameter light spot can be addressed successively
to any or all points of a 26,000-wide by 32,000-long rectangular point
array with 7-µm vertical and horizontal spacing in about ten minutes.
This corresponds to a scanning rate of one spot per 600 nanoseconds.
The light spot is placed repeatedly to an accuracy of about a ±7-µm
total accumulated error over the whole array, and the vertical and
horizontal spacing between points is maintained within ±1 µm.

The rectangular point array is scanned one line at a time at the rate 
of 53 lines per second by successive sweeps of a monitored laser beam 
across the width of the array interposed by 7-µm steps of the photographic
plate in the perpendicular direction. The essential components
of the scanning system are shown in Fig. 1. The laser generates a 
light beam which, by various stationary mirrors, is directed to the 
acoustooptic modulator. When this modulator is turned on, a small 
portion of the laser beam is slightly deflected and is denoted the 
write beam. The major portion of the light beam, called here the 
code beam, passes through the modulator with no directional change. 
When the modulator is turned off, the light beam passes through unchanged.
The response time of the modulator is of the order of 10
nanoseconds which is very small compared to the 600-nanosecond 
period it takes the scanning beam to move from one addressable point 
to the next. 

By further fixed mirrors and lenses, the code and write beams are
brought to focus at neighboring points near the edge of the
photographic plate.

Fig. 1—Primary pattern generator.


By means of the scanning lens and the decagonal mirror, the focused 
spot of the write beam is imaged onto the photographic plate. The 
focused point of the code beam is, by the same means and one addi- 
tional code beam mirror, imaged onto a code plate. The code beam is 
intercepted by the code plate except at 7.0-µm-wide transparent lines
on 14-µm centers. The light passing through these transparent lines
is collected in a photocell by means of a Fresnel lens. As the decagonal 
mirror turns, the two beams move together. The code beam, by pulsing 
the photodetector, yields positional information to the computer which, 
by means of the modulator, regulates the write beam on or off as 
required for proper exposure of the photographic plate. 

The decagonal mirror spins at 300 rpm resulting in 53 write-beam 
sweeps per second. The 10 facets of the decagonal mirror are inclined 
to the mirror’s radial symmetry axis at a very small angle which 
is identical for all facets within +4 of one second of arc. Furthermore, 
the mirror’s radial-symmetry axis spins with a wobble less than 1/10
of one second of arc. Therefore, any sweep of the write beam when
the photographic plate is fixed traces lines that are separated by no 
more than +3 pm. 

The 300-rpm speed of the decagonal mirror results in about 11-ms- 
duration sweeps across the photographic plate, with about a 7-ms-long 
period between the end of one and the beginning of the next sweep. 
The computer may write in every sweep, and the step system must 
be designed so that a step may be completed in the 7-ms period be- 
tween sweeps. If the computer writes in every sweep, the table steps 
at 53 steps per second. If the computer cannot write in every sweep, 
one or more steps are skipped as required for the computer to catch 
up. This step motion is a sophisticated vibration-free one where each 
step is equal to the next within +4 pm, and the total accumulative 
error over 32,000 steps is about +5 um assuming temperature control 
within 0.2°C. 


II. MATERIALS SELECTION 


The material used for the major PPG structure is Meehanite GC40. 
This material was chosen for its great dimensional stability with time. 
To insure that the material was initially stress-free, a three-step heat 
treatment-machining sequence was used. Briefly: 


(i) After Casting
(a) Heat to 1600°F. Hold 2 hours.
(b) Cool to 1250°F at 35°F per hour.
(c) Hold at 1250°F for 10 hours.
(d) Cool to 200°F at 20-25°F per hour.


(ii) After Rough Machining (allow 0.020” for final machining)
Thermally cycle: 210°F to 400°F to −120°F to 400°F
to 200°F. Hold at −120°F and 400°F for 2 hours.
Final cooling to 200°F must not exceed 25°F per hour.


(iii) After Final Machining
(a) Heat to 300°F. Hold for 6 hours.
(b) Cool to 200°F at 20-25°F per hour.


The residual stress after heat treatment will not exceed 200 psi, result- 
ing in a maximum relaxation strain of about 10 microinches per inch. 
Micro-creep tests conducted at Battelle Institute indicated that most 
of this relaxation occurs in the first four to six weeks which is before 
assembly of the pattern generator. Thus, only a few microinches per 
inch is expected during the life of the pattern generator.


III. TWO SPECIAL AXIAL ALIGNMENTS


Two very accurate axial alignments are made in the pattern genera- 
tor. In one, the axis of an air bearing is aligned with the radial- 
symmetry axis of the decagonal mirror. In the other, the axis of the 
air bearing is aligned with the direction of motion of the step table. 
Both alignments use an elastic micromanipulator which was developed 
especially for the pattern generator. The alignments are essentially 
identical and only the decagonal mirror alignment is described here. 


3.1 The Elastic Micromanipulator 

The elastic micromanipulator is based upon a very elementary
mechanical deamplification device. It consists of two springs that are
connected in series and deflected against a support. In the static case,
the total deflection of the springs, $\Delta\delta_1$, is related to the
deflection of the interface of the springs, $\Delta\delta_2$, by the
relationship

$$\Delta\delta_2 = \frac{k_1}{k_1 + k_2}\,\Delta\delta_1 ,$$

where $k_1$ and $k_2$ are the respective spring constants. The motion,
$\Delta\delta_2$, is thus directly related to $\Delta\delta_1$ by the
deamplification factor $F = k_1/(k_1 + k_2)$, which can be made as small
as one pleases by choosing $k_2 \gg k_1$. In order to use such a device
as a micromanipulator, one provides a fine screw to manually produce the
deflection, $\Delta\delta_1$, and one attaches the body to be moved to
the spring interface so that the corresponding body motion is
$\Delta\delta_2$ as shown in the lower part of Fig. 2.


3.2 Alignment of the Decagonal Mirror 


The adjustment for axial alignment of the decagonal mirror consists 
of three elastic micromanipulators placed 120° apart and equidistant 
from the symmetry axis. Between the face of the air-bearing spindle 
and one side of the mirror are the three stiff springs and, on the other 
side of the mirror directly opposite to these stiff springs, are the three 
soft springs which can be pressed against the mirror individually by 
three fine-adjusting screws. 

The nature of the three stiff springs requires some explanation. The 
air-bearing face is machined with three raised #” X 3” areas as indi- 
cated in Fig. 3. The surface of these areas is finished machined with a 
stationary tool when the air bearing is spinning so that their surfaces 
lie in a plane normal to the air-bearing axis within about a second of 
arc. The decagonal mirror has an optically flat end face and this face
is placed directly against these three pads. As the mirror is pressed
against these raised areas by the soft springs on the opposite side of the
mirror, the pads elastically indent the mirror as indicated in the upper
part of Fig. 2. There is also some corresponding local indentation of
the air-bearing face. Except for these small local regions of
deformation, the mirror and the air bearing remain essentially rigid and
the elastic deformation in the three small regions serves the purpose of
the three stiff springs. The various mechanical elements are shown in
detail in Fig. 4.

Fig. 2—Elastic micromanipulator.

The relationship between the force exerted by the soft springs and
the corresponding deflection of the stiff springs can be worked out from
a classical elasticity solution due to J. Boussinesq. From this solution
one can determine the effective spring constant associated with each of
the three stiff springs. They are given approximately by

$$k_2 = 2 \cdot 10^{6}\ \text{lb/in.}$$

The soft springs on the opposite side have a spring constant given by

$$k_1 = 6 \cdot 10^{2}\ \text{lb/in.}$$

and the deamplification factor, $F$, works out to be about

$$F = 3 \cdot 10^{-4}.$$

Fig. 3—Air-bearing spindle with raised areas.

The pitch of the adjusting screws is 40 turns per inch, and thus for one
complete revolution of the adjusting screws the mirror will move about
$18 \cdot 10^{-2}$ microns. When only one adjusting nut is advanced, the
mirror will rotate about an axis passing through the two raised areas
opposite the other two adjusting nuts. The raised areas are separated by
about 7 cm, and thus the resulting rotation of the mirror equals about
0.6 second of arc per revolution of the adjusting nut. Since the
adjustment is carried out together with an instrument to measure the
mirror axis run-out, there is no need to know this relationship exactly.
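
The deamplification arithmetic above can be checked with a short sketch. It assumes soft and stiff spring constants of about 6·10² and 2·10⁶ lb/in (a reading of the quoted values that is consistent with F of about 3·10⁻⁴). It also assumes, which the text does not state, that the three pads sit at the corners of an equilateral triangle with 7-cm sides, so that the lever arm about the axis through two pads is 7 cm · sin 60°. All names are illustrative.

```python
import math

def deamplification(k_soft, k_stiff):
    """Static deamplification factor F = k1 / (k1 + k2) for two springs in series."""
    return k_soft / (k_soft + k_stiff)

# Assumed values (lb/in), chosen to be consistent with F of about 3e-4.
K_SOFT = 6.0e2
K_STIFF = 2.0e6

F = deamplification(K_SOFT, K_STIFF)

# Travel of a 40-turns-per-inch screw in one revolution, in microns.
screw_travel_um = 25_400.0 / 40.0

# Mirror motion at the pad opposite the adjusted screw, per screw revolution.
mirror_travel_um = F * screw_travel_um

# Assumption: pads form an equilateral triangle with 7-cm sides, so the
# moving pad sits 7 cm * sin(60 deg) from the axis through the other two.
lever_arm_um = 7.0e4 * math.sin(math.radians(60.0))

# Small-angle rotation of the mirror, converted from radians to arc-seconds.
rotation_arcsec = math.degrees(mirror_travel_um / lever_arm_um) * 3600.0

print(f"F = {F:.2e}")
print(f"mirror travel per revolution = {mirror_travel_um:.2f} micron")
print(f"mirror rotation per revolution = {rotation_arcsec:.2f} arc-second")
```

Under these assumptions the sketch gives a travel of a little under 0.2 micron and a rotation of roughly 0.6 to 0.7 second of arc per revolution, in line with the figures quoted in the text.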

In the PPG, the mirror axis is aligned with the air-bearing axis to 
zy of a second, and we know that the adjustment remains stable to this 
accuracy over long periods. 

When the mirror facets are measured to perform the final grinding 
operations, the mirror is aligned to %o of a second. This more precise 
adjustment has been demonstrated to be stable over several days, but
it has not been evaluated on a long-term basis. 


IV. THE STEPPING SYSTEM 


There are two simple and fundamental concepts involved in the 
pattern-generator stepping system. One of these is a special electronic 
drive for the step motor used in the mechanical drive of the stepping 
system. The other is tuning of the natural frequency of the second 
mode of motion of the mechanical drive. Together these two concepts 
permit vibration-free stepping in the absence of passive damping. 
There are also several practical problems involved in the construction 
of the step table. One describes here first the two simple concepts, next 
the problems of construction, and last some experimental results. 


4.1 The Special Electronic Drive 

In order to describe the special electronic drive, first one describes
certain characteristics of the stepping motor. The motor torque, $T$, as
a function of the angular position of the armature, $\theta$, is shown in
Fig. 5a for a given current in the two motor windings. The amplitude of
the sinusoidally varying torque is called the holding torque. The
holding torque is proportional to the current in the motor windings. The
magnitude of this current is usually kept constant, and only its
direction is changed in the normal operation of the stepping motor. The
effect of successively changing the direction of the current in each
motor winding is indicated in Fig. 5b.

Fig. 4—Telescopic view of the decagonal mirror adjustment.

The mechanics of a simple operation of a stepping motor are essen- 
tially as follows: Assume the motor to be at rest in step position, n, 
which is one of the stable-equilibrium positions associated with the 
motor torque indicated by the solid curve in Fig. 6. Let the current be 
changed in one winding, thus bringing about the motor torque indi- 
cated by the dotted curve. The motor will now accelerate towards the 
step position n + 1 and, depending upon the damping in the motor, 
assumed less than critical, it will vibrate about the new position with 
decaying amplitude. This vibration is completely intolerable for the 
present application. Furthermore, if the motor is stepped continuously, 
vibration build-up from one step to the other occurs. To eliminate the 
vibration, the motor is provided with a special electronic drive. This 
drive provides three timed current settings for the motor per step 
which are applied as follows: Assume as before that the motor is at 
rest in position n as indicated in Fig. 6. The current is now reversed
in one winding for a timed period, $t_1$, bringing about the motor torque
indicated by the dotted curve. As before, this will accelerate the motor
towards its new step position, n + 1. However, $t_1$ is adjusted such
that at the end of this timed period the motor is at a point about
halfway between n and n + 1, and it is of course still moving. The
current is now reversed again in the same winding for another time
period, $t_2$, bringing about the torque indicated by the solid curve in
Fig. 6. This torque decelerates the step motor until it stops, and $t_1$
and $t_2$ are timed such that the point at which the motor stops
coincides with the new step position, n + 1. The current in the same
winding is now reversed a third time, producing the motor torque
indicated by the dotted curve. This third current setting will hold the
motor in the new equilibrium position until one wishes to make another
step. This stepping technique produces vibration-free stepping without
passive damping. Such an electronic device has been used previously in
Bell Laboratories for a magnetic tape drive.

Fig. 5—Characteristics of the stepping motor. (a) 0, 1, 2, ..., N are
the step positions of the motor. 2, 6, ..., (2 + 4I), where I is an
integer, are equilibrium positions of the motor for a particular current
direction in the two motor windings as schematically indicated by the
arrows in the ellipse on the right. (b) The motor torque as a function
of $\theta$ is simply translated by one step each time the current is
reversed in one winding.

Fig. 6—Motor torque, $T$, as a function of angular position, $\theta$.


4.2 A Tuned Two-Degrees-of-Freedom System 

In the previous description of the special motor drive it was tacitly 
assumed that the stepping motor and all that it drives behaves as a 
single-degree-of-freedom system, i.e., that the motion of all bodies in- 
volved can be determined from a single independent variable. This 
state exists if such things as backlash, elastic deformation of parts, 
etc., are negligible. If the time to complete a single step is made suffi- 
ciently long, say by decreasing the motor torque, our step system will 
behave sensibly as a single-degree-of-freedom mechanical system in- 
volving only rigid-body motion. However, if the time to complete a 
step is made short enough as was the case in the pattern generator, 
one will also excite noticeable motion involving elastic deformation in 
components of the system. One is then confronted with a much more
complicated multidegree-of-freedom mechanical system. Specifically,
there was one deformational mode of motion that could not be elimi- 
nated. The special motor drive does not then by itself yield vibration- 
free stepping. One describes here how we were able to control this 
deformational mode by tuning its natural frequency. 

The stepping table is shown in Fig. 7. It consists of a stepping motor 
driving the shaft of a ball-lead screw, a thrust bearing preventing axial 
motion of the shaft relative to the rigid base, a step table on linear 
roller bearing ways and driven by the nut of the lead screw. There 
are two modes of motion that come into play in this stepping system:
(i) The motion in which all bodies remain rigid and involving shaft
rotation and linear table motion as constrained by the lead-screw pitch.
One denotes this mode the ideal rigid-body mode. (ii) The mode of
motion where the table, as in the first mode, moves as a rigid body on
its ways but now as a result of elastic deformation primarily of the
Hertz type that occurs in the balls and races of the lead screw.

Fig. 7—Stepping system.

A simple analysis of this two-degrees-of-freedom system reveals an 
interesting characteristic, namely, that by an adjustment of the natural 
frequency of the second mode of motion, the special electronic motor 
drive will step the table with no vibration in either mode. Subsequent 
experiments proved that such mechanical tuning is a practical matter. 
In order to describe the essential mechanics involved, some aspects of 
the simple analysis are given here. 

Because of special mechanical characteristics of the step table the 
two modes of motion mentioned above, namely, the ideal rigid-body 
mode and the mode involving deformation in the ball screw, are very 
nearly the normal modes of the system. Therefore, the shaft rotation 
under the action of the motor torque is sensibly unaffected by the 
elastic deformation in the ball screw and can be calculated quite ac- 
curately, taking only the rigid-body mode into account. The second 
mode of motion can be equally accurately calculated, taking it to be 
a single-degree-of-freedom system whose support is given an inexorable
motion identical to the table motion associated with the rigid-body
mode. The equations for this determination of the first and second
mode are

$$J\ddot{\theta} = T, \qquad x_0 = p\,\theta, \qquad \ddot{x}_0 + \ddot{x} = -\omega^2 x,$$

where $T$ is the motor torque, $J$ is the sum of the rotatory inertia of
the motor and lead-screw shaft plus an equivalent table rotatory inertia,
$\theta$ is the angular position of the motor, $x_0$ is the first-mode
table motion, $x$ is the second-mode table motion, $p$ is the lead-screw
pitch, $\omega$ is the circular natural frequency of the second mode, and
dots indicate time derivatives. One assumes now first that the motor
torque is a constant over the acceleration period $t_1$ and the same
constant with negative sign during the deceleration period $t_2$.
Secondly, one assumes $t_1 = t_2$ and that the constant torque is
selected so that $\dot{x}_0$ is zero when $x_0$ is increased by one step,
i.e., the special drive is adjusted to give no vibration in $x_0$ at the
end of a step. Lastly, one assumes that $x$ and $\dot{x}$ are zero at
the beginning of a step. One obtains then for the amplitude of vibration
in the second mode, $A$,

$$\frac{A}{L_0} = \frac{\sin^2(\pi f t/2)}{(\pi f t/2)^2},$$

where $L_0$ is the length of one step, $f = \omega/2\pi$, and
$t = t_1 + t_2$. One notes now that $A = 0$ when $ft/2$ is an integer.
According to this simple analysis, there should be no vibration if
$f = 286$ cps when $t = 7 \cdot 10^{-3}$ s as
required in the pattern generator. This frequency corresponds closely 
to the frequency determined both experimentally and from a more 
rigorous numerical analysis at which vibration was found to vanish. 
The vibration amplitude, A, is plotted in Fig. 8 as a function of f. This 
curve reveals another important point, namely, that where A is zero, 
the slope of the curve is also zero. For that reason, there is no need to 
adjust the frequency of the second mode accurately to effectively
eliminate vibration, which would have been impractical. One notes that
the above solution applies to continuous stepping only when $ft/2$ is an
integer since only then are $x$ and $\dot{x}$ zero at the beginning of
each step. If vibration in $x$ occurs, one has to contend with vibration
build-ups from one step to the next.
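
The tuning condition can be explored numerically. The sketch below evaluates the closed-form residual amplitude and, as an independent check, propagates the second-mode equation exactly through one constant-acceleration interval and one constant-deceleration interval. It assumes the idealized constant (bang-bang) torque of the simple analysis, a 7-µm step, and equal acceleration and deceleration times totalling 7 ms. The function names are illustrative.

```python
import math

def residual_amplitude(f, t, step=1.0):
    """Closed form: A = L0 * sin^2(pi f t / 2) / (pi f t / 2)^2."""
    u = math.pi * f * t / 2.0
    return step * (math.sin(u) / u) ** 2

def _advance(x, v, c, w, tau):
    """Exact solution of x'' + w^2 x = c over a time tau from state (x, v)."""
    xe = c / w ** 2                      # static offset under constant forcing
    cs, sn = math.cos(w * tau), math.sin(w * tau)
    x_new = xe + (x - xe) * cs + (v / w) * sn
    v_new = -(x - xe) * w * sn + v * cs
    return x_new, v_new

def simulated_amplitude(f, t, step=1.0):
    """Residual second-mode amplitude after one bang-bang step of length `step`."""
    w = 2.0 * math.pi * f
    a = 4.0 * step / t ** 2              # support acceleration that covers `step` in time t
    x, v = 0.0, 0.0                      # second mode at rest before the step
    x, v = _advance(x, v, -a, w, t / 2)  # support accelerating: forcing is -a
    x, v = _advance(x, v, +a, w, t / 2)  # support decelerating: forcing is +a
    return math.hypot(x, v / w)          # amplitude of the free vibration left over

if __name__ == "__main__":
    t = 7.0e-3                           # total step time, seconds
    step_um = 7.0                        # nominal step length, microns
    for f in (100.0, 200.0, 2.0 / t, 400.0):
        print(f"{f:7.1f} cps: closed form {residual_amplitude(f, t, step_um):.4f} um, "
              f"simulated {simulated_amplitude(f, t, step_um):.4f} um")
```

The two functions agree to rounding error, and both vanish at f = 2/t, which is about 286 cps for a 7-ms step. Because the zero of sin² is a double zero, small errors in the tuned frequency leave only a very small residual amplitude.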

The rigid-body mode, $f = \infty$, is plotted together with the actual
table motion in Fig. 9. The difference between these curves is
essentially due to motion in the second mode. One notes that the second
mode, as the first, is excited only during the times $t_1$ and $t_2$,
and no subsequent motion occurs until the table is stepped again.


4.3 Some Practical Problems of Construction 

Several problems were encountered in the construction of the step 
table to make it, in fact, behave as the two-degrees-of-freedom system 
analyzed. A major problem was to reduce the number of degrees of
freedom of the system to two. This was done by increasing the natural 
frequency of the various other modes to a point where the step motion 
would not noticeably excite them. Our effort in this respect is reflected 
in the very massive and stiff structure of the pattern generator. 

Of particular interest also is the very massive support for the thrust 
bearing, noticeable in Fig. 7. A particular thrust bearing was selected 
which enabled us to get rid of a very objectionable third mode of 
motion in which the ball-screw shaft would move axially by elastically 
deforming the thrust bearing and its support. A very difficult problem 
was to find lead screws that combine high stiffness of the nut against
axial deformation relative to the shaft with low frictional torque. We
found ball-lead screws to be far superior in this respect to lead screws 
with acme threads. 


Fig. 8—Vibration amplitude, $A$, of the second mode as a function of its
frequency for a fixed step time, $t = 7$ ms. The curve plotted is
$A/L_0 = \sin^2(\pi f t/2)/(\pi f t/2)^2$, with the second-mode frequency
on the horizontal axis running from 0 to 600 cps.


4.4 Lead Screw Life Tests 


One of the most critical mechanical requirements of the PPG is 
that the drive train of the system have a sufficiently long life so that 
many years of product can be made without changing essential items 
which would affect the reproducibility accuracy of the system. One 
sees from Fig. 8 that a drive train-table combination whose stiffness 
yields a frequency of about 280 cps is desirable. To insure step
accuracy, it was desired that the stepping-system stiffness be great
enough to yield a frequency of 280 cps and that the frictional torque be
considerably lower than the stepping-motor holding torque so as to
minimize step error due to friction. A preload of 25 pounds on the ball
screw was found to yield the desired system stiffness and torque to
break static friction.

Fig. 9—Ideal rigid-body mode, $f = \infty$, superimposed on the actual
table motion, $f = 286$ cps (vertical axis: table motion in microns). The
discrepancy between the two curves is very nearly the motion of the
second mode. One notes that both modes are excited during $t_1$ and
$t_2$, but no motion persists in either mode once a step is completed.

The test setup used to establish the life test of the mechanical com- 
ponents of the drive train is shown in Fig. 10. The life-test setup 
duplicates the essential features of the PPG drive train. 

The status of the life-test equipment was monitored by periodically 
checking the torque to break static friction and the stiffness of each 
system. The stiffness was measured by determining the rigid-body 
resonant frequency of the drive train-table combination and then cal- 
culating the stiffness. The stiffness was also checked occasionally by 
statically measuring the drive-train stiffness by applying a known load 
and measuring the table deflection relative to the thrust-bearing sup- 
port. 
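
The conversion from a measured resonant frequency to a stiffness can be sketched as follows, treating the table as a single mass on a single spring so that k = m(2πf)². This one-mass model is an assumption about how the calculation was done. The 30-lb moving weight is purely hypothetical, since the text does not give it, and the function names are illustrative.

```python
import math

G_IN_PER_S2 = 386.09  # standard gravity in inches per second squared

def axial_stiffness_lb_per_in(freq_cps, weight_lb):
    """Stiffness of a one-mass, one-spring model: k = (W / g) * (2 pi f)^2."""
    mass = weight_lb / G_IN_PER_S2       # mass in lb s^2 / in
    return mass * (2.0 * math.pi * freq_cps) ** 2

def resonant_frequency_cps(stiffness_lb_per_in, weight_lb):
    """Inverse relation: f = sqrt(k g / W) / (2 pi)."""
    return math.sqrt(stiffness_lb_per_in * G_IN_PER_S2 / weight_lb) / (2.0 * math.pi)

if __name__ == "__main__":
    weight = 30.0                        # hypothetical moving weight, lb
    k = axial_stiffness_lb_per_in(280.0, weight)
    print(f"k = {k:.3e} lb/in at 280 cps")
```

Because stiffness scales with the square of frequency in this model, a slow rise in the measured resonant frequency during the life test corresponds to a proportionally larger rise in calculated stiffness.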

One sees from Fig. 11, which is typical of the data taken, that there
has been a pattern of decreasing torque to break static friction.
Similarly, from Fig. 12, the stiffness measurements for the units have
shown a tendency to increase with time.

Fig. 10—Typical life-test setup.

During the life test, a decrease in torque and an increase in stiffness
can be attributed to the fact that the screw and bearings are being 
burnished (i.e., worn in) and hence, the riding surfaces are more 
uniform and smoother. Furthermore, as things become smoother, more 
balls of the ball screw and needles of the thrust bearing become fully 
effective. 


4.5 Stepping Test Measurements 


The accuracy of the step table as determined experimentally is 
briefly as follows: Steps are reproducible to =4%4 pm. This reproduci- 
bility accuracy is primarily the result of some unavoidable coulomb 
friction in the drive and a small amount of vibration about the equilib- 
rium position. The absolute accuracy of steps is such that all steps are 
equal within +14 pm. 

Experimental determination of the table motion as a function of time
is given in Fig. 13.

Straightness of table travel with minimal transverse and rotary motions
is necessary to achieve reproducibility of spot positions on the
photographic plate. A table mounted on preloaded roller bearings was
employed to achieve the required accuracy. Measurements showed that
the rotational motion superimposed on the translational motion was
less than 10 seconds of arc and that the transverse motion was about
one micron.

Fig. 12—Axial stiffness of 2R life-test setup versus cycles run. Note:
1. Distance traveled per cycle = 19”; 2. Time per cycle = 190 s;
3. Axial stiffness measured as a function of fundamental frequency of
system. (Vertical axis: axial stiffness, K, in lb/in × 10⁻³; horizontal
axis: number of cycles run × 10⁻⁴.)

Fig. 13—Table displacement as a function of time. In the above figures,
the table displacement was obtained with a laser interferometer having
its digital output converted to an analogue output. The scale of the
horizontal axis is 2 ms per division, and the vertical axis is 1.34 µm
per division. Nominally, the table is to step 7 µm in 7 ms. The very
small steps noticeable in the curves are single counts of the laser
interferometer representing a displacement of about 0.079 µm. The first
two curves each show a single step. The difference between them shows
the effect of variations in friction and axial stiffness along the
length of the ball screw. The third figure shows two successive steps.
The discrepancy between them represents error introduced by the stepping
motor. The fourth curve shows 50 successive steps.


Device Photolithography: 


The Primary Pattern Generator 


Part III-The Control System 


By P. G. DOWD, M. J. COWAN, P. E. ROSENFELD and 
A. ZACHARIAS 


(Manuscript received July 10, 1970) 


I. INTRODUCTION 


The primary pattern generator (PPG) writing-control system has 
two main functions: (i) interpret the commands generated by the
XYMASK PPG postprocessor, and generate from these commands a
bit-by-bit image of a scan line and stepping-table control; and (ii) check
the operation of the PPG system. 

The interaction between the PPG and the writing-control system 
must take place in synchronism with the rotating mirror on the PPG. 
The writing beam moves continuously across the photographic plate; 
once a scan has begun, a complete line must be written. One task of 
the control system is to assemble completely the bit image of a line 
in a buffer before the start of that scan line. Each line consists of 
26,000 bits which must be taken from the buffer in a serial fashion 
in synchronism with the writing-beam position. A line is scanned in 
approximately 12 ms; hence, the bit rate during the writing period is 
2.2 Mb/s. We thus require real time interaction with a nonstop 
mechanical device operating at electronic speed. 

A complete pattern requires exposure of 32,000 scan lines or
approximately $10^9$ bits. Accurate operation requires a high degree of
system reliability and thorough checking of operations. One check uses
parity data generated in the XYMASK PPG postprocessor and regenerated from
the signal input to the optical modulator of the PPG. Other checks are 
on the interface between the control system and the PPG. These checks 
monitor the operation of the electronics; they proved valuable during 
fabrication of the system. 
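
The data rates quoted above follow from simple arithmetic, sketched here using only figures given in the text (26,000 bits per line, 32,000 lines, about 12 ms per scan, and 16 data bits per PDP-9 word). The variable names are illustrative.

```python
BITS_PER_LINE = 26_000        # address locations per scan line
LINES_PER_PATTERN = 32_000    # scan lines per complete pattern
SCAN_TIME_S = 12e-3           # approximate writing time of one line, seconds
DATA_BITS_PER_WORD = 16       # data bits carried in each 18-bit PDP-9 word

# Serial bit rate while a line is being written, in megabits per second.
bit_rate_mbps = BITS_PER_LINE / SCAN_TIME_S / 1e6

# Total number of bits in one complete pattern.
total_bits = BITS_PER_LINE * LINES_PER_PATTERN

# Number of PDP-9 words needed to hold one scan line.
words_per_line = BITS_PER_LINE // DATA_BITS_PER_WORD

print(f"bit rate       = {bit_rate_mbps:.2f} Mb/s")
print(f"bits/pattern   = {total_bits:.2e}")
print(f"words per line = {words_per_line}")
```

The total comes to a little under 10⁹ bits, and the word count per line matches the 1625-word buffers used in the control computer.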


II. THE CODE-PLATE SYNCHRONIZING SIGNAL SYSTEM 


Fig. 1—Block diagram of electronics.

A block diagram of the computer control system is shown in Fig.
1. The code-plate synchronizing signal is obtained from the
photomultiplier (PMT) used as the code-beam detector. The PMT output
consists of a periodic signal superimposed on a level change.
When the code beam begins its scan across the code-plate grating, the 
output level of the PMT changes with a relaxation time of approxi- 
mately 400 ns. Similarly, when the code beam finishes its scan and 
leaves the code-plate grating, the PMT output level returns to zero 
with the same time constant. The periodic signal superimposed on this 
average level change represents a modulation index of approximately 
4. However, as is seen in Fig. 2 the average amplitude of the PMT 
output changes quite significantly over the length of the scan. Since 
we are solely interested in the phase of the periodic component, a 
limiter with controlled AM to PM conversion is employed before phase 
detection. The limiter will necessarily drop out during the dead period 
of each scan. The resulting noise into the phase detector will be unac- 
ceptable and so a silencer must be employed. This is accomplished by 
a gate which is opened and closed by the level changes occurring at 
the start and end of the scan. The laser power can be changed by a 
factor of five without affecting the processed output. 

Fig. 2—Code-plate synchronizing signal. (a) Raw signal from the
photomultiplier tube showing the entire scan, X sweep = 12 ms/box.
(b) An expanded view showing the start of the track, X sweep =
2.2 µs/box. (c) Output of the limiter showing the entire scan,
X sweep = 1.2 ms/box.

Referring again to Fig. 1, the processed output from the code-plate
synchronization system is fed to the interface between the PDP-9
control computer and the PPG. The functions of the blocks are best 
explained by following the sequence of steps that occur when one line 
is written. Before the line can be written, the bit-by-bit image of that 
line must be assembled in a core buffer in the PDP-9. 


III. THE DATA INPUT SYSTEM AND CONTROL COMPUTER OPERATION 


The data read by the PDP-9 control computer is a sequence of 
magnetic-tape records each of which is 1625 PDP-9 words in length. 
These records are of two types: one type contains a series of operation 
commands which define changes to be made to the current scan-line 
buffer in producing the succeeding scan line. Consecutive lines normally 
do not differ appreciably in their makeup. Therefore, only a few com- 
mands can update a scan line and thus the updates for many scan 
lines can be held in one record. The second type of input record is used 
for those instances in which a great number of update commands 
would be required to produce the succeeding scan line. When this
condition arises, a record is produced which contains all of the 26,000
bits for the new scan line rather than the update commands which 
would be required to produce that scan line. 

Within the PDP-9 are four buffers of 1625 words each; one buffer 
holds the current scan-line data and another is the current-command 
buffer. The other two buffers allow for the overlapping of tape-reading 
operations with the updating and outputting of scan lines. Therefore, 
when a new scan-line buffer is requested or the current-command 
buffer is exhausted, outputting or processing of the next buffer can 
begin immediately. In the scan-line buffer, the rightmost 16 bits of 
each 18-bit PDP-9 word are used to designate whether the laser beam 
should be turned on or off at each of the 26,000 address locations. The 
magnetic-tape-handling operations were facilitated by making the 
update-command records the same length as the scan-line buffers. 

The following is a brief description of the operation codes used to 
update the scan-line buffer. 


(i) Change word N in the scan-line buffer in such a way that a
specified, single transition from “beam on” to “beam off” or vice
versa occurs. An index to the 32 possible single-transition words
is used; this allows the word address and the index number to
be packed into one PDP-9 word.
(ii) Change M consecutive words to all zeros.
(iii) Change M consecutive words to all ones.
(iv) Change word M to a specified bit-by-bit configuration. This
covers changes entailing more than a single transition.
(v) Replace the current scan line with the line described in the next
magnetic-tape record.
(vi) Write N scan lines identical to the last one.
(vii) Skip N scan lines. This allows the rapid coverage of blank areas
of the pattern.
(viii) Write N consecutive lines of all ones.


In addition to the scan-updating commands, the input tape contains 
control commands which direct the operation of the PDP-9 control 
program. These control commands cover such information as: (i) the
file number of the pattern information on the tape reel; (ii) the total
number of scan lines in the pattern; (iii) addresses for locating the
repeat sections of a scan line; (iv) end of update commands for the scan
line; (v) the number of horizontal repeats in a scan line; and
(vi) identification of the last line of a pattern.
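
The word-level update operations (i) through (iv) described in this section can be sketched as operations on a list of 16-bit values. This is an illustration only: the text does not give the actual command encoding, so the numbering of the 32 single-transition words, the explicit `start` argument, the choice of the high-order data bit as the first one written, and all function names are assumptions.

```python
WORDS_PER_LINE = 1625
MASK16 = 0xFFFF

def single_transition_word(index):
    """Hypothetical indexing of the 32 single-transition words.

    Indices 0-15: beam turns on at bit position `index` (0 = first bit).
    Indices 16-31: beam turns off at bit position `index - 16`.
    """
    if not 0 <= index < 32:
        raise ValueError("index must be in 0..31")
    pos = index % 16
    rising = MASK16 >> pos               # zeros before pos, ones from pos onward
    return rising if index < 16 else rising ^ MASK16

def set_transition(line, n, index):
    """Operation (i): replace word n by a single-transition word."""
    line[n] = single_transition_word(index)

def fill(line, start, m, ones):
    """Operations (ii) and (iii): set m consecutive words to all zeros or all ones."""
    if start < 0 or start + m > len(line):
        raise ValueError("fill runs past the scan-line buffer")
    line[start:start + m] = [MASK16 if ones else 0] * m

def set_word(line, n, bits):
    """Operation (iv): replace word n by an arbitrary 16-bit configuration."""
    line[n] = bits & MASK16

if __name__ == "__main__":
    line = [0] * WORDS_PER_LINE          # start from a blank scan line
    set_transition(line, 10, 4)          # beam turns on partway through word 10
    fill(line, 11, 20, ones=True)        # beam stays on for 20 words
    set_transition(line, 31, 16 + 9)     # beam turns off partway through word 31
    print(sum(bin(w).count("1") for w in line), "bits on")
```

Counting the all-ones and all-zeros words as transitions at the word boundary gives 16 positions times 2 directions, which is one way to arrive at the 32 words mentioned in the text.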


IV. THE INTERFACE BETWEEN THE CODE-PLATE SYNCHRONIZING SYSTEM 
AND THE CONTROL COMPUTER 


When the computer has finished assembling a scan line, it loads 
the starting address of the scan-line buffer into both the Repeat Ad- 
dress Register (RAR) and the Scan Address Register (SAR) (Refer 
to Fig. 1). It also sends signals to the break control and track detector 
telling them that a line may be written. The break control causes a 
word having address specified by the SAR to be fetched from memory 
and placed in the buffer register. The SAR is incremented by one so 
that it now points to the next word in the scan-line buffer. When the 
track detector finds the start of the timing track, it opens a gate and 
allows timing pulses to pass to the 17-bit shift register and the divide- 
by-sixteen counter. Each timing pulse causes the bits in the shift 
register to be shifted right one place. The output of the last stage of the 
shift register is used to control the laser writing beam, turning it on 
if it is a “one” and off if a “zero.” The divide-by-sixteen counter pro- 
duces an output pulse at each 16th timing-track pulse. This pulse causes 
the contents of the buffer register to be transferred to the shift register. 
This pulse also causes the break control to fetch another word from 
memory and deposit it in the buffer register. The line density counter 
counts the number of “ONES” that are shifted out of the shift register. 
This count is used in error checking. 

The process of transferring words from memory continues until a 
word which contains a “ONE” in either bit position 0 or 1 is loaded
into the buffer register. A “ONE” in bit 0 signals that the portion of 
the line being written is to be repeated. Therefore, the contents of the 
RAR are transferred to the SAR and the next word fetched by the 
break control will come from the location in the scan-line buffer speci- 
fied by the RAR. A “ONE” in bit 1 signals that this is the last word in 
the scan-line buffer for this scan line. The break control logic is 
disabled and ignores any further pulses from the divide-by-sixteen
counter. In addition, the control program is notified that the end of 
line has been reached; the track detector notifies the program that 
the end-of-track has been reached when that event occurs. 
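
The fetch-and-shift cycle described above can be modelled roughly as follows. This is a sketch under several assumptions that the text does not settle: PDP-9 bit 0 is taken as the most significant bit of the 18-bit word, the 16 pattern bits are taken to be the low-order bits, the pattern bits of a flagged word are assumed to be written like any other, and the number of repeats is passed in as a parameter because this section does not say how the hardware bounds the repeat loop.

```python
REPEAT_FLAG = 1 << 17    # PDP-9 bit 0, assumed to be the most significant bit
END_FLAG = 1 << 16       # PDP-9 bit 1
BITS_PER_WORD = 16       # one word is consumed every 16 timing pulses


def scan_out(buffer, repeats):
    """Return (bits, ones) written for one scan line.

    `repeats` (hypothetical) is how many times the repeat section is replayed.
    """
    rar = sar = 0                    # both registers start at the buffer start
    bits, ones = [], 0               # `ones` plays the line density counter
    while True:
        word = buffer[sar]           # break control fetches the word at SAR
        sar += 1                     # SAR now points at the next word
        for i in range(BITS_PER_WORD):
            bit = (word >> i) & 1    # shift right: low-order bit leaves first
            bits.append(bit)         # 1 = laser on, 0 = laser off
            ones += bit
        if word & END_FLAG:          # last word of the line
            return bits, ones
        if word & REPEAT_FLAG and repeats > 0:
            repeats -= 1
            sar = rar                # replay from the repeat address
```

For example, `scan_out([0x0003, REPEAT_FLAG | 0x8000, END_FLAG | 0x0001], 1)` replays the first two words once before the end-of-line word is reached.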

Anytime after the end-of-line is reached, the control program can 
command the carriage to be moved. This is done by transferring a word 
to the carriage control logic that specifies how many steps the carriage 
is to be moved. If the carriage is to be moved one line, the carriage 
control logic will cause the stepper motor driver to deliver the sequence 
of steps required to cause the carriage to move and be stopped within 
the 7 ms allowable time. Thus, if the next line is assembled in the 
core buffer of the PDP-9, the line will be written by the succeeding 
mirror facet. If the carriage is to be moved more than one line, then 
the number of lines less one to be moved must be all blank, and so 
carriage motion can be carried out asynchronously at high speed. After 
the last line is stepped, synchronism is regained by the operation of 
the track detector. The last line is always output by the carriage con- 
trol logic as if only one line were to be moved. This effectively stops 
the carriage 7 ms after the last line command is issued. 

The remote control buffer is used to provide communication between 
the operator and the computer. It consists of a flip-flop register and 
lamp drivers for signaling the operator, and gates to allow the com- 
puter to sense the pushbuttons the operator uses to signal it. 


V. ERROR DETECTION 


There are a number of safeguards in the control program which 
check on both hardware and software types of errors. Error detections 
are transmitted to the operator via teletype and light signals. Most 
errors are fatal and necessitate the restarting of the pattern. When this 
type of error is encountered, the current run is aborted and the photo- 
graphic plate is unloaded from the machine. Some errors occur before 
the pattern is begun and, in these cases, the operator is advised but 
no unloading takes place.

Most hardware errors are detected in two major ways. One is a
count of the number of pulses the code-plate synchronizing signal system
sends to the track detector. If this deviates by more than ±1 pulse,
then a fatal error is detected. Another track check is the occurrence
of end of track before the end-of-line word has been written. The 
second way that hardware errors are checked is by the line density 
counter. If some malfunction occurred, then the number of ONES in 
the line written will not agree with the control command specifying 
the number of ONES in that line. This line-density count checks not 
only the functioning of the interface hardware, but also the PDP-9 
assembly of the line image in the scan-line buffer. Other hardware 
errors are checked by comparing the SAR value at the end of a scan 
line with the value the SAR should have after the line is output. The 
carriage control is checked by comparing the reading of a shaft encoder 
on the stepper motor with the required reading after the pattern is 
completed. This shaft encoder gives an indication of its position only 
once in 500 lines, so continuous monitoring is not feasible. However, if 
the reading at the end of the pattern is not correct, then indication of 
a carriage error is given to the operator. 
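
The per-line hardware checks described above amount to three comparisons. A minimal sketch, with hypothetical argument names and message strings (the actual control program is not reproduced in this paper):

```python
def check_line(pulses_seen, pulses_expected,
               ones_written, ones_expected,
               sar_final, sar_expected):
    """Return a list of fatal-error descriptions for one scan line."""
    errors = []
    # Timing-track pulse count may deviate by at most one pulse.
    if abs(pulses_seen - pulses_expected) > 1:
        errors.append("timing-track pulse count off by more than 1")
    # Line density counter must match the count given by the control command.
    if ones_written != ones_expected:
        errors.append("line-density count mismatch")
    # SAR must end where the assembled line says it should.
    if sar_final != sar_expected:
        errors.append("SAR end value mismatch")
    return errors
```

An empty list means the line passed; any entry corresponds to a fatal error that aborts the run.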

Software and magnetic-tape errors are detected by program routines 
in the PDP-9. Illegal update commands, magnetic-tape reading errors 
and other magnetic-tape controller errors are the main errors detected 
by these routines. 


Device Photolithography: 


The Primary Pattern Generator 
Part IV-Alignment and Performance
Evaluation 


By A. M. JOHNSON and A. ZACHARIAS 
(Manuscript received July 10, 1970) 


I. REQUIREMENT FOR ALIGNMENT 


The mechanical nature of the primary pattern generator (PPG) 
requires a precise juxtaposition of most of the machine elements in 
order to achieve both pattern accuracy and reliable functioning of the 
machine. Part II described the alignment of the rotating polygonal 
mirror to the air-bearing axis. The precision required in that assembly 
is the tightest tolerance in the PPG. This precision is required to pro- 
duce a uniform scan-line spacing on the pattern. In addition, the direc- 
tion of that scan line must be made as perpendicular as possible to 
the travel direction of the photographic plate. Therefore, the carriage 
of the photographic plate must move without rotation. The method 
for aligning the polygonal mirror axis to the carriage direction will 
be described, as well as other alignment needed to produce an accurate 
pattern. The code-plate system for controlling the fast scan was de- 
scribed in Parts I and III. Implicit in this description was the assump- 
tion that the code-plate grating and the photographic plate are the 
exact same distance from the scanning lens (see Fig. 1 in Ref. 1). The
positioning of the code plate to achieve accurate length of the fast
scan is a critical alignment that requires a combination of optical 
and electronic techniques. 

The accuracy goal for the PPG was 100 parts per million (ppm) 
deviation from an absolute coordinate system, the error reference being 
the overall dimension of the full PPG field. Thus the coordinate axes 
of the pattern must be orthogonal to within 20 seconds. A second of arc
is approximately 5 × 10⁻⁶ rad. The photographic-plate position is
determined by a lead screw as described in Part II. The accuracy of this
screw is the determining factor in the overall length error of the plate
translation axis. For convenience, we will refer to this axis as the Y- 
axis and the fast scan axis as the X-axis. 
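
The link between the 100-ppm accuracy goal and the 20-second orthogonality tolerance is a small-angle conversion, which can be checked as follows (the function name is illustrative only):

```python
import math

ARCSEC_IN_RAD = math.pi / (180 * 3600)    # about 4.85e-6 rad


def ppm_to_arcsec(ppm):
    """Angle in arc seconds whose small-angle displacement is `ppm` of the field."""
    return ppm * 1e-6 / ARCSEC_IN_RAD


print(round(ppm_to_arcsec(100), 1))       # 20.6
```

A 100-ppm displacement across the field therefore corresponds to an axis tilt of roughly 20 seconds of arc.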

The functional alignment includes positioning the optical modulator, 
obtaining separation of the coding and writing beams, positioning of 
the scanning lens, and positioning of various other lenses and mirrors 
in the optical paths of the two beams. The design and alignment of 
the laser cavity is described. The long-term functioning of the PPG 
will require replacement of the laser discharge tubes. Our design-and- 
alignment procedure allows tube replacement without realignment of 
the remainder of the optics. 


II. FUNCTIONAL ALIGNMENT 


The quartz laser tube is clad with a water-cooling jacket and is 
rigidly mounted within a solenoid which provides the axial magnetic 
field. By placement against pins, this assembly is located precisely on 
a flat plate on which the cavity mirrors are rigidly mounted. This sys- 
tem was devised so that a remotely located reference cavity can be 
used to prealign a laser-tube-solenoid assembly to the laser cavity on 
the PPG. The use of the reference cavity significantly reduces the down 
time of the PPG during laser replacement; replacement of the laser 
does not require realignment of the PPG. 

The laser cavity is of a nearly hemispherical configuration consisting 
of a 0.9-m radius highly reflecting mirror and a flat, transmission 
mirror at the output. The separation is 0.75 m. The output is
constrained to the TEM₀₀ mode by using a 2-mm aperture inside the
cavity near the spherical mirror. The 514.5-nm line is selected by the 
transmission characteristic of the output mirror.* The output mode of 
the laser has a 1/e-amplitude radius (Ref. 2) of 200 μm. The train of lenses and
mirrors (see Parts I and II) which is used to direct the laser output 
to the optical modulator was aligned by autoreflection at each mirror. 
The lenses were inserted after the beam had been correctly positioned. 
Back reflections from each lens were used to center accurately that lens. 
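
As a rough cross-check of the quoted mode size, the textbook expression for the waist of an ideal plano-concave (half-symmetric) resonator, w₀² = (λ/π)·√(d(R − d)), can be evaluated with the cavity dimensions given above. This is an idealized estimate, not the authors' calculation; it ignores the intracavity aperture and any lensing in the discharge tube.

```python
import math


def plano_concave_waist(wavelength_m, mirror_radius_m, separation_m):
    """1/e-amplitude radius at the flat mirror of an ideal plano-concave cavity."""
    g = separation_m * (mirror_radius_m - separation_m)
    return math.sqrt(wavelength_m / math.pi * math.sqrt(g))


w0 = plano_concave_waist(514.5e-9, 0.9, 0.75)
print(round(w0 * 1e6))    # about 234 (micrometres)
```

The ideal value of about 234 μm is of the same order as the 200-μm radius quoted in the text.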

The optical modulator must be positioned to the Bragg angle. The 
angle is set by periodically exciting the modulator and then detecting 
the deflected beam with a photodetector and maximizing the modula- 
tion. After the modulator is positioned, the writing-beam separation 


* The reflective band of the transmission mirror is centered near 550-nm wave- 
length. The edge of the band is at 514.5 nm and thus the reflectivity at all the 
other spectral lines is insufficient for oscillation. 
mirror (see Part I) is positioned. A 10-cm focal length lens placed at
the modulator output produces the required spatial separation of the
writing and coding beams. At the separation mirror each beam has a
1/e-amplitude radius of 50 μm and the center-to-center beam spacing is
400 μm. At this location, the coding beam is 20 to 50 times the intensity
of the writing beam. The light from the coding beam which is scattered
in the writing beam direction is removed by a 0.75-mm aperture
placed concentric with the writing beam. Slight tilting of lenses elimi- 
nates objectionable back reflections. After these adjustments, the on- 
off ratio of the writing beam is greater than 50. 


III. ACCURACY ALIGNMENT 


The path of the writing beam from the modulator to the scanning 
lens (see Fig. 1 of Ref. 1) is determined by three adjustable mirrors 
in addition to the writing beam separation mirror. These three mirrors 
are used to properly direct the writing beam into the scanning lens. 
However, the proper position of the scanning lens is determined partly 
by the positions of the rotating mirror and photographic plate. Conse- 
quently, the rotating mirror must first be aligned to the photographic 
plate; then the writing-beam illumination of the scanning lens can be 
set and finally the scanning lens is positioned. 

The alignment between the rotating polygonal mirror and the trans- 
lational direction of the photographic plate (Y-axis) is accomplished 
by use of a precision cube and an autocollimator. The cube is mounted 
on the photographic-plate carriage in such fashion that a cube face is 
normal to the Y-axis. Errors are introduced by the yaw, pitch and roll 
of the carriage; each contributes a few arc seconds of error. First, two
faces of the cube are indicated parallel to the Y-axis by using sensors 
capable of detecting 45 μm displacement. The cube face normal to
these two faces is normal to the Y-axis. The X-axis of the pattern is 
the intersection of a plane normal to the axis of rotation of the poly- 
gonal mirror (this plane is also normal to all of the facets of this 
mirror) and the plane of the photographic plate. The plane of the 
photographic plate must be parallel to the Y-axis or else the X-axis as 
defined above will not always be in the focal plane of the scanning lens. 
A sufficient, but not necessary condition for the X-axis to be normal 
to the Y-axis is to make the carriage travel direction parallel to the 
rotation axis of the polygonal mirror. This is accomplished by using 
an autocollimator to set the reference face of the polygonal mirror (the 
reference face is perpendicular to all the facets of the mirror) parallel 
to the face of the precision cube which is normal to the Y-axis. 

The actual angle between the X- and Y-axes was determined by
generating a test pattern on the PPG and measuring this pattern with
a coordinate-measuring machine (CMM).* This measurement could
be made with an error of less than 3 seconds of arc. Thus, a correction
to the direction of the rotating mirror was determined and used to reset
the X-axis. Since this correction was less than 20 seconds of arc, no
other alignment was disturbed.

After the initial positioning of the rotating-mirror axis, the writing 
beam must be directed to the center of the entrance pupil of the scan- 
ning lens. This is set by autoreflecting the writing beam from a prop- 
erly positioned polygonal mirror facet. The proper angle of the facet 
is calculated from the parameters of the scanning lens. The polygonal 
mirror facet is exactly positioned by the use of an autocollimating 
theodolite. The position which must be taken by the axis of the 
scanning lens is now fully constrained. This position is duplicated by 
a helium-neon laser beam which is positioned normal to a facet of the 
polygonal mirror. This facet is first set parallel to the X-axis. The 
He-Ne laser beam is also passed through the center of the scan line on 
the photographic plate. The scanning lens is positioned by centering 
its back reflections of the He-Ne laser beam thereby aligning the axis 
of the scanning lens with the He-Ne laser beam. 

The last step in the X-axis alignment is the length-accuracy adjust- 
ment of the code-plate position. To accomplish this, a replica of the 
code-plate grating is produced by contact printing onto a photographic 
plate. This plate is then positioned in the PPG in exactly the manner 
a photographic plate is positioned when it is to be exposed. A long, 
silicon PIN photodetector is placed under the replica grating. The 
focused writing beam will produce a signal output from the PIN photo- 
detector as it sweeps across the replica grating. However, the long 
photodetector has very little bandwidth. To circumvent this photo- 
detector deficiency, the output of the actual code plate is used to 
modulate the writing beam by feeding the code plate signal into the 
optical modulator. Now the long photodetector under the replica 
grating will only have to respond to the beat frequency between the 
code-plate signal and the writing beam sweeping the replica. By adjust- 
ing the beat frequency to zero throughout the scan, the exact position 
registration between writing and coding beams is obtained. This 
method of alignment resulted in less than 10-ppm error in the X-axis 
length. Residual errors are caused by camber of the photographic plates
(see Part I), inevitable temperature variations, and camber in the
coding-beam output mirror (see Fig. 1 of Ref. 1).


IV. PERFORMANCE EVALUATION


The design and fabrication of the necessary high-frequency mechan- 
ical components allowed the synchronization between the fast scan and 
the photographic-plate translation to be accomplished by a simple, 
computer-controlled system. Further, this step-on-command system 
allows flexibility in the computer control so that future work can
produce a more economical division of work between the PPG control
computer and the PPG postprocessor.’ At present, very few of the 
patterns drawn by the PPG have required the machine to wait for 
the computer to finish assembly of a line. 

The rotating mirror presented the most critical item in terms of 
tolerance. The periodic bunching and spreading of the scan lines caused 
by the nonideal mirror results in both a periodic variation in the opti- 
cal density of exposed regions and a periodic displacement in feature 
edges which are parallel to the Y-axis. The optical density variation is
lost when the pattern is photographed by the reduction cameras. However,
the periodic displacement is still detectable after the first reduction;
the peak-to-peak amplitude is less than one-third address.

The major inaccuracy in the PPG is the Y-axis length. The lead 
screws used are accurate to within 15 ppm at 20°C. However, the lead- 
screw temperature in the operating machine is 25°C and so the Y-axis 
length is in error by 90 to 100 ppm. Since the lead screws can be
replaced, this error can be eliminated.
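
A back-of-envelope reading of these figures: if the whole 90- to 100-ppm error is attributed to the 5°C difference between the calibration and operating temperatures (an assumption made here, ignoring the 15-ppm screw tolerance), the implied expansion coefficient follows directly.

```python
def implied_expansion_ppm_per_degc(error_ppm, t_operating_c=25.0, t_reference_c=20.0):
    """Expansion coefficient that would explain `error_ppm` over the temperature rise."""
    return error_ppm / (t_operating_c - t_reference_c)


low = implied_expansion_ppm_per_degc(90)     # 18.0 ppm per degree C
high = implied_expansion_ppm_per_degc(100)   # 20.0 ppm per degree C
```

The result, 18 to 20 ppm per °C, is only an inference from the quoted numbers; the paper does not state the screw material or its coefficient.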

The measured reproducibility of the PPG cannot be separated from 
the reproducibility of the coordinate-measuring machine. It was found 
that remeasurement of a PPG plate on the CMM produced readings 
which showed a variance of one-third address at the extremes of the 
pattern field. Near the CMM reference point in the pattern, the vari- 
ance of the readings was approximately one-sixth address.* Such be- 
havior indicates a systematic error such as that caused by temperature 
differences. If the reproducibility of the CMM is accounted for, the 
variance in the location of a PPG-produced feature is not greater than 
one-third address and may be less than one-fourth address. Figure 1 
shows the measured scatter of identical features drawn on 18 separate 
plates made over a period of two months. The (X, Y) address location 
of the CMM reference was (1000,1375) in the PPG field. The scales
on the axes of the scatter plots are in addresses with respect to the
absolute coordinate. Note the error increase in Y caused by the excess 
length of the Y-axis. 

The PPG, as constructed, meets all of the requirements set by the 
mask-making system.® 

Fig. 1—Reproducibility of pattern generator.


Device Photolithography:


The Electron Beam Pattern Generator 


By W. SAMAROO, J. RAAMOT, P. PARRY and G. ROBERTSON 
(Manuscript received June 8, 1970) 


An electron beam pattern generator is being developed to write directly
on photographic plates with a 4-μm diameter beam over a 5-cm by 5-cm
field with an address structure of 25,000 by 25,000. Two unique features
of this pattern generator are random-access computer control of the beam
and a 15-bit digital-to-analog converter stable to better than ±1 part in
10⁶. Capability for drawing 4-μm lines having an edge gradient less than
0.5 μm and an optical density greater than three has been demonstrated.
Stability of better than ±1 μm in 24 hours over a 4-mm by 4-mm field has
been achieved. Experiments still in progress have demonstrated ±1-μm
stability over the entire 5-cm by 5-cm field for shorter time periods. Reticles
of typical complexity are drawn routinely in less than five minutes.


I. INTRODUCTION 


The demand for integrated circuits is increasing rapidly, and pro- 
jections indicate that the existing mask-making facilities will be 
severely overloaded in the near future. The major portion of the time 
required to make a mask is taken in producing the reticle. The electron 
beam pattern generator was originally conceived to assist the mask- 
making shop by producing reticles rapidly. 

The use of a computer-controlled electron beam also holds promise 
for solving other problems. As integrated circuits become more com- 
plex, it is increasingly difficult to meet the line-width and field-size 
requirements of the final masks. The fundamental limits set by diffrac- 
tion effects are currently being approached; moreover, the depth of 
focus of the lens system producing these masks is so small that severe 
requirements are made on material tolerances. Due to the extremely 
short wavelength of kilovolt electrons, diffraction effects are negligible 
and it is possible to write with beams a few tenths of a micron in 
width over small fields. A. N. Broers et al.¹ have succeeded in producing
interdigital surface-wave transducers of 0.3-μm width and 0.7-μm
spacing using a modified form of scanning electron microscope. With
electron beams it is possible to use very high f numbers to give a large 
depth of focus; this relieves the problem of extreme materials toler- 
ances. 

This paper describes an experimental machine built to prove the 
feasibility of drawing reticles on photographic plates. The requirements 
to be met are as follows. The address structure should be greater than 
25,000 by 25,000 over a 5-cm by 5-cm field. The line should be 4 μm
(two address units) wide and have an optical density greater than two.
Stability and reproducibility should be within ±1 μm, or ±20 ppm
(parts per million). As the machine has fast random access rather than 
a raster scan, the writing time is proportional to the area covered, and 
the machine should be able to cover 20 percent of the field within five 
minutes. 

The above requirements are sufficient for all of the expected
reticles for the next few years; they are also sufficient for about 90
percent of the expected masks. The specifications allow a 4-μm feature
of one mask level to be registered within an 8-μm feature of another
level. Because of the method of programming the computer, there is
very little extra computation time involved in drawing a mask as
compared with a reticle.

The electron optical column was built from commercially available 
parts, and does not represent the ultimate in performance. However, 
it has been demonstrated that the electron beam is potentially a very 
valuable tool in the manufacture of reticles and integrated-circuit 
masks. 


II. DESCRIPTION OF THE SYSTEM


Figure 1 shows a block diagram of the equipment. A Digital Equip- 
ment Corporation PDP-9 computer is used to generate data for an 
interface to control the electron beam. Input information to the
computer is obtained from design programs such as XYMASK.² The division
of work between the computer and the interface is best explained by 
describing the technique used to draw patterns. 

As shown in Fig. 2, patterns are drawn using line segments rather 
than picture points. The line segments may be up to 256 addresses long 
and the patterns are filled in at a rate of 1 μs per address. This is an
important aspect of the system as it allows 20 percent of a 25,000 
by 25,000 address structure to be covered in less than three minutes. 
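
The figures quoted so far are mutually consistent, as a short calculation shows. The variable names are illustrative, and the fill time counts only the 1 μs per address, not any per-segment overhead.

```python
FIELD_M = 0.05                 # 5-cm field
ADDRESSES = 25_000             # addresses per axis
TIME_PER_ADDRESS_S = 1e-6      # fill rate of 1 microsecond per address

address_size_um = FIELD_M / ADDRESSES * 1e6               # about 2 micrometres
stability_ppm = 1e-6 / FIELD_M * 1e6                      # 1 micrometre over 5 cm, about 20 ppm
fill_time_s = 0.20 * ADDRESSES**2 * TIME_PER_ADDRESS_S    # about 125 seconds
```

A 4-μm line is thus two addresses wide, ±1 μm corresponds to ±20 ppm of the field, and covering 20 percent of the field takes about 125 s, which is under three minutes.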

Fig. 1—Block diagram of the equipment (magnetic tape, PDP-9 computer,
control unit, D/A converters, materials).


To draw the unfinished feature shown in the blowup of Fig. 2, the
coordinates of the four indicated points are read into the computer.
The start point and length of a single line segment are fed to the inter- 
face. While the interface is controlling the beam to draw that particular 
line segment, the computer is calculating the position and length of the 
adjacent line segment. This division of work between the computer and 
interface minimizes the amount of information to be supplied to the 
system and gives the system a programming flexibility as will be dis- 
cussed later. 
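
The division of work just described can be illustrated with a generator that turns a rectangle into (start point, length) pairs, one pair at a time, as the computer would hand them to the interface. The horizontal fill direction and the function name are assumptions; only the 256-address limit comes from the text.

```python
MAX_SEGMENT = 256    # longest line segment the interface accepts


def rectangle_segments(x0, y0, width, height):
    """Yield (x_start, y, length) for horizontal segments filling a rectangle."""
    for y in range(y0, y0 + height):
        x = x0
        remaining = width
        while remaining > 0:
            length = min(remaining, MAX_SEGMENT)
            yield (x, y, length)      # one load of the interface registers
            x += length
            remaining -= length


segments = list(rectangle_segments(10, 5, 300, 2))
```

Because the segments are produced lazily, the next one can be computed while the interface is still drawing the current one.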

Fig. 2—Method of filling blocks using line segments.


Both familiar forms of graphics, a television-like raster and
point-by-point plotting, were rejected for this system. A raster-type
generation is very inflexible in comparison to a random-access system and
requires the transmittal of large quantities of information which is 
generally obtained on a large computer, thus incurring additional costs. 
Point-by-point plotting under the control of the computer would make 
generation times impractically long. Although the pattern generator 
uses line segments, it is a true random-access machine since a line seg- 
ment may be one address long. 

The Electron Beam Machine (EBM), with its associated power 
supplies and the photographic materials which go into the work cham- 
ber, will be discussed in detail in following sections. 


2.1 The Electron Beam Machine 

In the EBM, a beam of electrons writes directly on photographic 
plates thereby utilizing the good resolution inherent in electron beams. 
As in the case of CRTs, where the overall resolution is dependent on 
the phosphors, the resolution of the EBM is limited by the recording 
medium. For reticle and mask generation, where Kodak High Resolu- 
tion Plate (HRP) emulsion is used to achieve the desired plotting 
speed, the resolution or edge definition of a line is limited to about 
0.5 μm.

The electron optical column consists of a triode gun with a
re-entrant Wehnelt cylinder, two demagnifying lenses and one projection
magnetic lens. The two demagnifying lenses are used to produce
a 4-μm-diameter image just below the second lens, and the projection
lens reproduces this image at a 35-cm working distance. The long
working distance makes it possible to scan a 5-cm field with deflection
angles of less than 5°. An electrostatic deflection system is used in
preference to a magnetic system, in which eddy current and hysteresis in
the chamber walls would reduce the speed and accuracy to below
acceptable limits. Because of the small deflection angles and apertures
used, deflection defocusing with the electrostatic system presents no
problem.
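
The deflection angle follows from the field size and working distance. A quick check, taking the angle along one axis from the field center to the field edge:

```python
import math

FIELD_CM = 5.0
WORKING_DISTANCE_CM = 35.0

# Half the field width subtended at the working distance.
half_angle_deg = math.degrees(math.atan((FIELD_CM / 2) / WORKING_DISTANCE_CM))
print(round(half_angle_deg, 1))    # 4.1
```

The on-axis deflection of about 4.1° is within the 5° figure quoted above.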

The current of the 15-kV beam is in the range of 0.1 nA to 1 nA. The 
large f number of the final lens (8000) enables a 4-μm beam to be
achieved using commercially available lenses,* and keeps aberrations
to negligible proportions. The operating pressure of 10-° Torr is pro- 
duced by a liquid nitrogen trapped 4” oil diffusion pump. 


2.2 Control of the Beam 
As was indicated before, patterns are composed from line segments,
which may be up to 256 addresses long. Figure 3 shows a block diagram
of the computer interface which controls the beam during the generation
of the line segment. Only the x-axis control is shown; the y-axis
control is identical.


Fig. 3—Block diagram of computer interface.


* Made by Canal Industrial Corp. (Canalco), Rockville, Maryland.

To draw a line segment ΔX (ΔX ≤ 256) addresses long in the x
direction starting at (x₁, y₁), x₁, ΔX and y₁ are loaded into the
X-register, ΔX-register, and Y-register, respectively. The beam is blanked
at this time but the voltages corresponding to x₁ and y₁ are generated
by the X and Y Digital-to-Analog Converters (DACs) and are applied
to the deflection system. Therefore, when the beam is unblanked, it
is at (x₁, y₁). A start signal from the computer unblanks the beam
and opens the gate to allow a continuously running clock to increment
the 8-bit counter, which is initially set in the zero state. The output
from the 8-bit counter is converted to an analog signal by the eight
less-significant bits of the 9-bit DAC. The output of the 9-bit DAC is
attenuated by a factor of 64 and is added to the output of the 15-bit
DAC. In this way, a voltage ramp is generated to move the beam from
x₁ to x₁ + ΔX. The comparator compares the ΔX-register with the
8-bit counter and turns off the gate and blanks the beam when they are
equal. The most significant bit on the 9-bit DAC allows lines to be
drawn in both positive and negative x and y directions.
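
The ramp generation can be sketched numerically. The model below assumes that both DACs share the same normalized full-scale voltage, in which case the factor-of-64 attenuation makes one count of the 9-bit DAC equal to one address of the 15-bit DAC. The `direction` argument stands in for the sign bit of the 9-bit DAC, and the sketch is limited to segments of at most 255 addresses because it does not model how the hardware handles a full 256-address segment with an 8-bit counter.

```python
FULL_SCALE = 1.0                            # normalized deflection voltage (assumed)
LSB15 = FULL_SCALE / 2**15                  # one address on the 15-bit DAC
LSB9_ATTENUATED = FULL_SCALE / 2**9 / 64    # one count of the 9-bit DAC after attenuation


def draw_segment(x1, dx, direction=+1):
    """Return the deflection voltage at each clock tick of one line segment."""
    assert 0 < dx <= 255, "an 8-bit counter holds 0..255 in this sketch"
    counter = 0                             # counter starts in the zero state
    voltages = []
    while counter != dx:                    # comparator stops the clock at delta-X
        voltages.append(x1 * LSB15 + direction * counter * LSB9_ATTENUATED)
        counter += 1                        # one clock pulse passes the gate
    return voltages


ramp = draw_segment(1000, 4)
```

Here `ramp` holds the voltages for addresses 1000 through 1003; the X-register itself never changes while the beam is on.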

This method of generating line segments offers a number of advan- 
tages. It has already been mentioned that locating each address with 
the computer makes the generating time impractically long. With 
the interface described, the line segment is generated at 1 μs per point.
If the X-register were incremented directly from the clock pulse
(thereby eliminating the 9-bit DAC and the 8-bit counter), then,
depending on the initial state of the X-register, some of the more
significant bits would change state, and large transients of
the output voltage would occur. The use of the ΔX converter system
ensures that only the less significant bits are switched while the beam
is on. The sketch inserted to the right of Fig. 3 is meant to convey this 
idea. Experimentally it has been found that switching transients in the 
9-bit DAC appearing at the deflection system are within tolerable 
limits. 
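
The transient argument can be made concrete by counting how many bits switch on a single increment. In the worst case a directly incremented 15-bit register switches every bit at once, whereas the 8-bit counter can switch at most eight low-order bits.

```python
def bits_changed(a, b):
    """Number of bit positions that differ between two register values."""
    return bin(a ^ b).count("1")


worst_direct = bits_changed(0x3FFF, 0x4000)    # 15-bit X-register carry: 15 bits switch
worst_counter = bits_changed(0x7F, 0x80)       # 8-bit counter carry: 8 bits switch
```

With the counter scheme those eight bits drive only the attenuated 9-bit DAC, so the high-order bits of the 15-bit DAC stay fixed while the beam is on.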

The output of the DAC drives the deflection plates directly. At the
beginning of a line segment, before the beam is unblanked, 10 μs are
allowed for the output to settle. While the output is settling, the
interface is loaded, which takes 11 μs. At the end of a line segment it
is necessary to wait only 1 μs after the counter stops before blanking
the beam.
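
These timings give a rough per-segment cost. The sketch below assumes that the 10-μs settling time is fully hidden inside the 11-μs load time, so that the setup cost is 11 μs; the text does not state the overlap precisely, so this is an estimate only.

```python
SETUP_US = 11          # interface load, assumed to overlap the 10-us settling
US_PER_ADDRESS = 1     # writing rate
BLANK_WAIT_US = 1      # wait after the counter stops


def segment_time_us(length):
    """Estimated time to set up, draw, and blank one line segment."""
    return SETUP_US + length * US_PER_ADDRESS + BLANK_WAIT_US


long_segment = segment_time_us(256)    # 268 us, of which 256 us is writing
single_point = segment_time_us(1)      # 13 us for a one-address segment
```

Under this assumption long segments spend most of their time writing, while isolated points are dominated by the fixed overhead.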


2.3 High-Precision Digital-to-Analog Conversion 

The best commercially available DACs have 13-bit resolution with
0.01-percent accuracy. There are DACs with 15-bit resolution and
0.01-percent accuracy, but their resolution is not consistent with their
accuracy because the less significant bits are not reproducible.

Normal practice in DAC is to switch accurately controlled voltages 
through precision resistors using transistor switches for their speed. 
The resistors form a binary series and the currents from the resistors 
are summed through a load. A simplified sketch of a 3-bit DAC of 
this type is shown in the left side of Fig. 4. It is because of the insta- 
bilities across the transistor switches that only 0.01-percent stability 
can be achieved. 

Fig. 4—Simplified diagram of conventional and high-precision DACs.


DACs have been developed for this system whereby the voltage is
regulated after switching as shown in the right side of Fig. 4.
Notice that parallel current-source regulation is used instead of
series-voltage regulation. Operational amplifiers which act as current sources
compare the switched voltages with a reference and compensate for any 
variation by either supplying more or less current to the load. Details 
of this circuit have been described elsewhere.® 

Using such a circuit, it has been shown experimentally that after 
the switch the voltage can be regulated to ±0.0001 percent or ±1 ppm.
Only the more significant bits need be built in this fashion. An 18-bit 
system was built, in which 13 bits are conventional and only the five 
more significant bits are regulated in this way. 

The secondary reference source used for comparison includes a pri- 
mary standard and was also made using the principle of parallel-cur- 
rent source regulation. The resistors in the most significant bits and in 
the secondary voltage standard are stable to ±1 ppm/°C.*

An 18-bit DAC was built and tested successfully, but only the 15 
most significant bits are used in the present application. One of the 
measurements made to test the DACs is illustrated in Fig. 5. The input 
to one DAC was held constant while the input to a second one was 
incremented and the output voltages were summed and recorded. The 
arrow indicates the step which resulted when the most significant, or 
first, bit was turned on and all other bits were turned off. Each step 
corresponds to a change of the least significant bit and equals a 4-ppm 
change in the total deflection voltage. It was found that the combina-
tion of the 13-bit DAC and the high-precision 5-bit DAC was cali-
brated to better than 4 ppm and that the least significant bit was
clearly resolvable. Other similar measurements have shown that the
24-hour stability of the 18-bit DAC is 1 ppm.

* Resistors manufactured by Julie Research Laboratories, Inc., New York, New
York.

Fig. 5—Output voltage of the 18-bit DAC showing the matching of the most
significant bit switch within a structure of the least-significant bits.
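The step sizes quoted above follow directly from the word length. The short sketch below is an illustration only: it assumes an ideal binary-weighted converter, and the function name is ours rather than anything used in the system.

```python
def lsb_ppm(bits: int) -> float:
    """Size of one least-significant-bit step in parts per million of full scale."""
    return 1e6 / (1 << bits)

# An 18-bit converter resolves about 3.8 ppm per step, the "4-ppm" step in the text.
print(round(lsb_ppm(18), 2))  # 3.81

# With only the 15 most significant bits in use, one step is about 30.5 ppm.
print(round(lsb_ppm(15), 2))  # 30.52
```

The ±1-ppm regulation of the five most significant bits is therefore finer than one least-significant-bit step of the full 18-bit word.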


2.4 Programming 


Because the system has random access and because of the particular 
division of work between the computer and its interface, programming 
of the electron beam pattern generator is simplified and results in ele- 
gant solutions. However, the biggest speed factor is the small amount 
of input information required by the system in comparison with other 
graphics systems. 

XYMASK is a rather general program and the output of this design 
program must be “postprocessed” for the particular pattern generator. 
Since the computer of the electron beam pattern generator is available 
for large portions of the drawing time, a considerable amount of the 
computation normally done in postprocessing is done in real time in 
the PDP-9. 

It has been shown how the electron beam pattern generator fills in 
rectangles at the rate of one address per microsecond. Circles, as well 
as complex geometries bounded by quadratic functions and lines of any 
given slope, may also be filled in at the same rate. This was made 
possible by “integer arithmetic,” which avoids the use of multiplication 
and division in determining the boundaries of those geometries.* Algo-
rithms based on integer arithmetic, which are run during the free time
on the PDP-9, allow the programming of complex geometries in real 
time. The programming is described in detail elsewhere in this issue.” 
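The algorithms actually used are described elsewhere in this issue and are not reproduced here. As an illustration of the general idea only, the sketch below uses the well-known midpoint circle technique, which finds the boundary of a filled circle with nothing but integer additions, subtractions, and comparisons. The function name and the span representation are assumptions made for this example.

```python
def circle_spans(cx, cy, r):
    """Return {row: (x_left, x_right)} for a filled circle of radius r.

    Only integer addition, subtraction, and comparison are used;
    doubling is written as an addition to make that explicit.
    """
    spans = {}
    x, y = r, 0
    err = 1 - r  # integer decision variable of the midpoint method
    while x >= y:
        # Eight-way symmetry yields four horizontal spans per step.
        for row, half in ((cy + y, x), (cy - y, x), (cy + x, y), (cy - x, y)):
            lo, hi = spans.get(row, (cx, cx))
            spans[row] = (min(lo, cx - half), max(hi, cx + half))
        y += 1
        if err <= 0:
            err += y + y + 1
        else:
            x -= 1
            err += (y - x) + (y - x) + 1
    return spans


spans = circle_spans(10, 10, 3)
addresses = sum(right - left + 1 for left, right in spans.values())
print(addresses)  # 37 addresses to be filled for r = 3
```

Each span is a run of consecutive addresses on one row, which is the form in which a raster-filling generator would consume it.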


2.5 Electron-Sensitive Materials

The photographic plates used in the EBM consist of 6 µm of Kodak
HRP emulsion on a flat glass plate coated with a thin layer of chro-
mium. The glass flatness is better than ±0.27 µm per linear cm
and the optical density of the chromium layer is 0.04. The chromium
layer has a resistivity less than 1000 Ω per square and is used to dissi-
pate the charge of the electrons.

An interesting feature of electron beam exposure of these plates is 
that the image formed occurs in about the upper micrometer and a half 
of the emulsion. For projection exposure of the patterns produced by 
the electron beam system, this reduces the depth-of-field requirement 
on the projection system. Experiments on electron sensitization of 
photoresists have been performed at the Western Electric Engineering 
Research Center, Princeton, New Jersey, and are described in the 
following paper.® 


III. STABILITY AND CALIBRATION OF THE SYSTEM


When drawing a mask, it is essential to maintain stability for at 
least the time required to complete the pattern, typically five minutes. 
In order to insure registration of mask levels made at different times, 
it is essential to be able to maintain calibration over a long period of 
time. The EBM has been designed to have a short-term stability and 
long-term recalibration capability of better than ±1 µm or ±20 ppm
over the entire field. This section contains a discussion of systematic 
and random errors and a description of the calibration method. 


3.1 Systematic Errors 

The reproducibility with which the beam can be deflected a distance
y is related to the stability of the accelerating voltage V_a, the deflec-
tion voltage V_d, and the distance L from the deflection plates to the
sample by the equation

$$\frac{\Delta y}{y} = \frac{\Delta V_a}{2V_a} + \frac{\Delta V_d}{2V_d} + \frac{\Delta L}{2L}. \qquad (1)$$


These errors are all zero at the center of the field and increase out to 
the edge of the field. The expression Δy/y in equation (1) is the frac-
tional change in y which occurs for a given set of instabilities, meas-
ured from one corner of the field. Since every point on the reticle must
be within the specified tolerance, it is not sufficient to require, for
example, that the root mean square of the three terms on the right
side of equation (1) be less than ±20 ppm; rather, the sum of their
absolute values must be less than 20 ppm.

The steps taken to insure good stability and low transients in the 
deflection voltage have already been described. High-voltage stability 
was obtained by using a commercially available* 0- to 20-kV supply
with estimated stability under constant load of better than ±10 ppm
per hour. Variations in the distance L can arise either from surface
non-uniformities on the electron-sensitive material or from a lack of
reproducibility in referencing successive samples to the top of the sam-
ple holder. Surface variations are held to less than ±0.5 ppm and
±5 ppm is allowed for referencing successive samples.
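The difference between the worst-case criterion adopted here and a root-sum-square criterion can be made concrete with a small calculation. The sketch below is illustrative only: the term values are hypothetical inputs, assumed to be already expressed as contributions to Δy/y in ppm, and the function names are ours.

```python
import math


def worst_case_ppm(terms_ppm):
    """Linear sum of the magnitudes of the individual error terms."""
    return sum(abs(t) for t in terms_ppm)


def root_sum_square_ppm(terms_ppm):
    """Root-sum-square combination, shown only for comparison."""
    return math.sqrt(sum(t * t for t in terms_ppm))


# A tolerance of 1 µm over a 5-cm (50,000-µm) field, expressed in ppm.
tolerance_ppm = 1e6 * 1.0 / 5.0e4
print(tolerance_ppm)  # 20.0

# Hypothetical contributions (ppm) from accelerating voltage,
# deflection voltage, and working distance.
terms = [10.0, 1.0, 5.5]
print(worst_case_ppm(terms))                 # 16.5
print(round(root_sum_square_ppm(terms), 2))  # 11.46
print(worst_case_ppm(terms) < tolerance_ppm)  # True
```

The linear sum is always at least as large as the root-sum-square value, so a budget that passes the worst-case test also passes the weaker one.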


3.2 Random Errors 


In addition to the sources of errors just described, there are two 
sources of random error which must be considered. First, any insulating 
material in or near the path of the beam will tend to charge to the 
cathode potential, and the resultant electric field will cause the beam 
to deflect away from the computed position on the electron-sensitive 
material. Experiments conducted with this system indicate that charg- 
ing of the electron-optical components is not a problem when proper 
cold-trap techniques are used. A small charging effect was observed 
under worst-case conditions when photographic emulsion on glass sub- 
strates was used without any metallic underlay; however, no detectable 
effect was observed on hundreds of samples with metallic underlay. 

Second, time-varying magnetic fields at any frequency from essen- 
tially d.c. to several kHz (in particular 60 Hz) limit reproducibility by 
deflecting the beam from the programmed position by an amount which 
is proportional to the magnitude of the field and to the square of the 
interaction distance. Therefore, in a system with a long working dis- 
tance, it is especially important to shield against magnetic distur- 
bances. This problem is made difficult by the fact that the shielding 
must extend over a wide bandwidth down to very low frequencies. 
However, successful shielding has been obtained in the past by using 
an enclosure made up of successive layers of highly conductive mate-
rial and of high-permeability magnetic material. A simple multi-layer
shield which was constructed has reduced deflections due to 60-Hz
magnetic fields and to slow changes in the field of the earth to less
than ±1 µm. A larger shield is being designed®? to enable the electron
beam apparatus to operate in magnetic environments somewhat noisier
than those found in most research laboratories.

* Power Design Model HV 1584-R, produced by Power Design, Inc., Westbury,
New York.


3.3 Calibration 

The calibration technique, which uses electron-beam-induced sample 
current, is shown schematically in Fig. 6. The alignment target con- 
sists of a gold grid on a chromium substrate. When the electron beam 
is swept over the target, the current through the picoammeter varies 
due to the different back-scattering coefficients of gold and chromium. 
Figure 7 shows a chart recording of the changes in the sample current 
when the beam passes over a gold stripe. The stripe can be detected 
with a S/N of better than 100 and its position can be determined to 
within ±0.5 µm. The calibration will be accomplished by adjusting the
accelerating voltage or the deflection voltage to maintain a constant 
number of address units between the fiducial marks. 
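The calibration step lends itself to a brief sketch. The code below is a hypothetical illustration, not the procedure implemented on the machine: it assumes the sample current has been sampled once per address unit, that a stripe appears as a run of readings departing from a known baseline, and that the correction is applied as a multiplicative factor on the deflection voltage. All names are ours.

```python
def stripe_center(samples, baseline, threshold):
    """Address (possibly half-integer) at the middle of the first stripe.

    A stripe is the first contiguous run of samples whose departure
    from `baseline` exceeds `threshold`.
    """
    hits = [i for i, s in enumerate(samples) if abs(s - baseline) > threshold]
    if not hits:
        raise ValueError("no stripe found in the scan")
    end = hits[0]
    for i in hits[1:]:
        if i != end + 1:
            break
        end = i
    return (hits[0] + end) / 2


def deflection_scale(nominal_units, measured_units):
    """Factor for the deflection voltage that restores the nominal count.

    If more address units than nominal are needed to span the fixed
    distance between two fiducial marks, each unit is too small and the
    deflection voltage must be raised in the same ratio.
    """
    return measured_units / nominal_units


scan = [0, 0, 0, -5, -5, -5, 0, 0]
print(stripe_center(scan, baseline=0, threshold=2))  # 4.0
print(deflection_scale(10000, 10002))                # 1.0002
```

Since the deflection varies inversely with the accelerating voltage, the same correction could instead be applied to that voltage using the reciprocal factor.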


IV. RESULTS 


4.1 Electron Beam Writing Characteristics 

Figure 8 shows a photograph of two intersecting lines written with 
the electron beam on the HRP emulsion with the chromium underlay. 
These lines are 4 µm wide and were written by single passes of the
beam. The edge fuzziness of the lines is less than 0.5 µm, and no round-
ing off can be observed at the corners of the intersection. The optical
density of the lines is about three. 


Fig. 6—Diagram of the calibration target (electron beam incident on a gold
grid on a chromium substrate).

Fig. 7—Recording of calibration output. One complete cycle of the alternating
signal gives four equally spaced reference marks, each of one address unit. To
calibrate, the number of address units between fiducial marks is held constant.


4.2 Reticle and Mask Patterns 


Figure 9 shows a reticle produced by the electron beam pattern gen- 
erator over a 5-cm by 5-cm field. This pattern is a lower-level metalli- 
zation beam cross-over test reticle. The information for this pattern 
was obtained as an input deck to XYMASK. It was run on XYMASK on an
IBM 360/50 and postprocessed on the same machine with a post-
processor written for the electron beam pattern generator. The outer
edge of a corner of a path is purposely programmed with a circle. The
generation time for this pattern is about three minutes. Figure 10
shows a test pattern for XYMASK, illustrating the sloped-line capability
of the electron beam pattern generator. This pattern was also gener- 
ated in less than three minutes. 

Figure 11 shows a mask consisting of a 27 by 27 array of the patterns 
shown in Fig. 9. It should be emphasized that the pattern was drawn
over the 5-cm by 5-cm field without step-and-repeat in less than eight
minutes.

Fig. 8—Photograph of two intersecting 4-µm lines formed from single passes of
the beam.

Fig. 9—Reticle produced by the electron beam.


4.3 Stability 

From independent measurements of all the sources of error described 
in Section 3.2, it has been predicted that with the present equipment, 
the reproducibility of a pattern should be better than ±1.5 µm over the
entire 5-cm by 5-cm field. Stability experiments have been performed
by drawing grid patterns on the same plate at fixed-time intervals
and measuring any displacements. Stability of ±1 µm has been ob-
served for five-minute time periods. Experiments to measure the sta-
bility for longer periods of time are in progress. The largest source of
instability is 60-Hz magnetic fields and the second largest is fluctua-
tions in the accelerating voltage.

Experiments performed some time ago over a 4-mm by 4-mm field, 
which was the maximum field of the deflection system at that time, 
showed that stability better than ±1 µm could be obtained over a
period of 24 hours. Measurements were made by two independent
methods: by the use of a Preco comparator and by contacting two
plates made 24 hours apart on a single plate using the image integra-
tion process developed by R. E. Kerwin and examining the results
under a high-powered microscope.®

The following steps are being taken to improve long-term stability
over the 5-cm by 5-cm field to ±1 µm in the near future. A large four-
layer magnetically shielded enclosure is being constructed. The enclos-
ure will have a clean-room interior and will have two sets of walls
separated by a one-foot gap. Each wall will be made up of aluminum
on the inside and molypermalloy* on the outside, separated by approx-
imately two inches. This shield is designed to attenuate d.c. and 60-
Hz magnetic fluctuations by factors of at least 200 and 5000, respectively,
thereby reducing all beam position instabilities due to magnetic disturbances
to less than ±0.1 µm. The customized high-voltage power supply is expected
to contribute less than ±0.1 µm to the beam position instability. The
±1-ppm stability of the DAC is more than adequate (±0.05 µm) and
it is expected that the variations in the plate flatness and registration
will contribute less than ±0.25 µm.

* Supplied by Allegheny Ludlum Steel Corporation, Brackenridge, Pennsylvania.

Fig. 10—Test pattern for XYMASK illustrating sloped-line capability of the
pattern generator.

Fig. 11—Mask level containing an array of 27 × 27 of the patterns shown in
Fig. 9.
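The projected contributions can be totalled against the ±1 µm goal. The sketch below simply adds the upper bounds quoted in this section, using the linear (worst-case) summation adopted in Section 3.1; the variable names are ours.

```python
FIELD_UM = 5.0e4  # 5-cm field expressed in micrometers

# Upper bounds quoted in the text, in micrometers.
contributions_um = {
    "magnetic disturbances": 0.1,
    "high-voltage supply": 0.1,
    "DAC": 0.05,
    "plate flatness and registration": 0.25,
}

# Check that a 1-ppm DAC stability corresponds to 0.05 µm over the field.
dac_um = 1e-6 * FIELD_UM

total_um = sum(contributions_um.values())
print(round(dac_um, 3))    # 0.05
print(round(total_um, 3))  # 0.5
print(total_um < 1.0)      # True
```

Even when the bounds are added linearly, the total of about ±0.5 µm leaves a margin of roughly a factor of two against the ±1 µm target.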


V. CONCLUSION 


An electron beam pattern generator has been developed which has
a field size of 5 cm by 5 cm. The line width is 4 µm and the lines drawn
were shown to have an edge fuzziness of less than 0.5 µm. The optical
density of lines produced on HRP emulsion by a single pass of the
beam is about three. Experiments on a 4-mm by 4-mm field size showed
a stability better than ±1 µm over a period of 24 hours. In preliminary
experiments on a 5-cm by 5-cm field an instability of ±1 µm for five
minutes has been observed. Measurements indicate that the instability
on the 5-cm by 5-cm field over a five-minute time period is caused
mainly by 60-Hz magnetic fields; high voltage fluctuations become im-
portant for longer time periods. It is anticipated that improved shield-
ing and a new high voltage supply can produce long-term stability of
±1 µm.

When completed, the electron beam pattern generator may be used 
to produce reticles, which will be compatible with the step-and-repeat 
camera described in a following paper.® Alternatively, the system may 
be used to make masks directly, with a minimum line width of 4 µm
registering inside an 8-µm feature of another level. Although a 4-µm
line width is not small by electron beam standards, its generation and
control over a 5-cm by 5-cm field without step-and-repeat has not been
reported before. 

A linewidth of 4 µm over a 5-cm by 5-cm field was selected for reticle
making. Smaller linewidths can be obtained with better lenses. There
are many possible applications involving various combinations of line-
widths and field sizes. The present system can possibly be extended
to write with a 1-µm linewidth over a 5-cm by 5-cm field or with
sub-micrometer linewidth over a 1-cm by 1-cm field. In addition, the 
long depth of focus and sub-micrometer resolution capability are im- 
portant characteristics of electron beam systems for pattern generation 
directly on semiconductor slices. 
Device Photolithography:


Electron-Sensitive Materials 


By BARRET BROYDE 
(Manuscript received May 27, 1970) 


Certain additives increase the electron sensitivity of Kodak’s negative 
photoresists by a factor of five to seven; others increase the sensitivity of 
AZ-1350 by a factor of two to three. With additives the contrast of the nega-
tive resists is increased, leading to sharper edges and higher resolution.
Some of these additives also increase the light sensitivity of both positive 
and negative resists. A recording system based on a silver halide emulsion 
and containing a conductive underlay is also described. 


I. INTRODUCTION 


An electron beam pattern generator developed at the Western Elec- 
tric Engineering Research Center requires novel recording systems 
that possess high resolution, high sensitivity at short exposure times, 
flat surfaces, and a conductive underlay.: Silver halide emulsions are 
best suited for the generation of reticles by this generator, while for 
the production of one-to-one masks or the generation of patterns di- 
rectly onto silicon slices (thereby avoiding the use of masks) photo- 
resists are the preferred recording media. 

High-resolution emulsions and photoresists were chosen over other 
recording media since they offer the best combination of sensitivity 
and resolution.?*®> (See Table I.) Systems using these two recording 
media are discussed in this paper. 


II. SILVER HALIDE RECORDING MEDIA 


Kilovolt electrons passing through silver halide grains form latent 
images in much the same way as photons.** Although electron scattering 
by the grains causes some loss in resolving power, the edges of lines 
generated by writing electron beams have been found to be as sharp 
as edges made by conventional processes. 

The recording medium required when the writing beam is used for
reticle generation is made in the following way: Eastman Kodak coats
a high-resolution emulsion (649-GH) on glass manufactured by Liberty
Mirror Company, Brackenridge, Pennsylvania. Seamed 6” × 6” glass
plates, covered with a Liberty Mirror proprietary PE-81-E conductive
coating, transmitting 75-80 percent of incident visible light, are used
so that uniform coatings can be obtained. The majority of the transmis-
sion loss is in the glass. The surface of the glass is flat to ±27 µinch per
linear inch, which is sufficiently flat so that the total error of ±20 ppm
allowed for the electron beam machine is not exceeded.’ In order to
meet this flatness specification, glass 0.235 ± 0.01 inch thick is used.
The plates are cut to 3” × 3” before they are used in the electron beam
pattern generator. A conductive coating is used to avoid charge storage
by the recording medium. The proprietary Liberty Mirror coating was
chosen since it is highly conductive (<1000 Ω/square), transparent
(optical density <0.04), resistant to the precleaning procedure used
by Kodak before coating with emulsion, and can be applied without
heating the substrate.

TABLE I—ELECTRON BEAM RECORDING MEDIA

                                   Smallest Spot    Flux of 15-keV
                                   Recorded         Electrons Needed
Recording Media                    (µm)             to Record (C/cm²)    Ref.

High resolution silver halide
  emulsion                         1-2              10⁻⁹                 2, 3
Photoresists
  Negative (KPR, KTFR)             0.25             8-10 × 10⁻⁶          4, 5
  Positive (AZ-1350)               1                6 × 10⁻⁶             6
Methacrylate resists               0.5              8 × 10⁻⁶             7
Silicone resists                   0.4              ~10°%                8, 9
Polymerization of monomers
  absorbed from the vapor          1.5 X 107        107}                 10
Liquid crystals                    30               107°                 11
Ferromagnetics                     >100             ~1                   12, 13
Thermoplastics                     10               >5 × 10⁻⁴            14
Electrostatic                      >50              N.A.                 15
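The flatness specification above is stated in English units, whereas the companion paper on the pattern generator quotes ±0.27 µm per linear cm. The short conversion below, added here as a consistency check, shows that the two figures describe the same slope:

```latex
\[
\frac{27\ \mu\text{in}}{1\ \text{in}} = 27 \times 10^{-6}
\quad\Longrightarrow\quad
27 \times 10^{-6} \times 1\ \text{cm}
= 27 \times 10^{-6} \times 10^{4}\ \mu\text{m}
= 0.27\ \mu\text{m per linear cm}.
\]
```

Because the ratio is dimensionless, the same number applies in either system of units.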


III. ELECTRON RECORDING BY PHOTORESISTS


It is likely that the chemical reactions that the electrons cause in 
photoresists are the same as those induced by light. Negative resists 
undergo cross-linking!” while positive photoresists are usually con- 
verted to carboxylic acids’? and perhaps lactones.?9 

The direct exposure of photoresist coatings on the surface of silicon 
slices appears to be an attractive means of patterning semiconductor 
substrates. For this application the exposure time presently required
for a writing beam is unacceptably long.t Increasing the beam cur- 
rent to reduce exposure times might lead to undesirable thermal ef- 
fects, so work on increasing the sensitivity of photoresists was initiated. 
Both positive and negative resists were examined in order that the 
electron beam pattern generator would never be required to pattern 
more than half the addressable points, thereby minimizing exposure 
times. 
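The choice between the two tones can be written as a simple rule. In the sketch below, which is an illustration rather than part of the system software, a negative resist is taken to remain where it is exposed and a positive resist to be removed where it is exposed; the function name and its argument are assumptions made for this example.

```python
def choose_resist(resist_fraction):
    """Pick the tone that keeps the exposed area at or below one half.

    `resist_fraction` is the fraction of addressable points that must
    remain covered by resist after development.  Returns the tone and
    the fraction of points the beam must actually expose.
    """
    if not 0.0 <= resist_fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    if resist_fraction <= 0.5:
        # Negative resist: expose exactly the area that must remain.
        return "negative", resist_fraction
    # Positive resist: expose the complement, which is the smaller area.
    return "positive", 1.0 - resist_fraction


print(choose_resist(0.25))  # ('negative', 0.25)
print(choose_resist(0.75))  # ('positive', 0.25)
```

With both tones available, the exposed fraction never exceeds one half, which bounds the writing time for any pattern.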


3.1 Chemical Additives to Increase the Sensitivity of Negative Photoresists 

Recently, chemical additives have been found here?° which reduce
the flux of 15 keV electrons needed to expose negative photoresists
from 8-10 × 10⁻⁶ C/cm² to 1.5 × 10⁻⁶ C/cm². The absorbed energy
required for full exposure corresponds to 4.2 x 107° eV/cm?.1® Further
reductions in the required exposure are anticipated.

The additives, incorporated into the photoresist solution before it 
is applied, divide into two classes based on their mechanisms. The 
first class, alkyl and aryl compounds of heavy metals, e.g., hexa-
phenyldilead, reduce the required flux by acting in two ways: (i) they
increase the capture cross section of the resist so that more energy
is transferred to the resist layer, and (ii) since these compounds are
readily dissociated into free radicals, they probably reduce the re- 
quired flux by initiating more than the average number of crosslinks. 
The second class of additives, of which benzophenone is typical, does 
not increase the capture cross section but does cause more than normal 
crosslinking. This occurs either because the additives are readily 
dissociated into free radicals which initiate crosslinking, or because 
they are excited to low-lying triplet states after the primary process 
of absorption.?!*? These triplet states decay slowly and transfer energy 
to more distant polymers causing them to crosslink. 

Figure 1 shows the typical effects of an additive. The thickness 
of exposed and developed KTFR is given as a function of the ex- 
posing flux, both with and without benzophenone. Note that there 
is a threshold; that is, no insolubilization occurs below a certain flux 
of electrons. Since negative resists are a subset of crosslinking systems, 
they are not insolubilized until there is an average of one crosslink 
per molecule. Sufficient radiation to form this average number of 
crosslinks per molecule makes part of the radiated resist insoluble, 
giving rise to a threshold. Further radiation increases the thickness 
of the exposed photoresist by insolubilizing more and more of it until 
the maximum thickness is obtained. Similar results have been found
with KPR. Figure 1 also shows that the slope of the photoresist thick-
ness versus flux curve is much higher when additives are present; that
is, the resist with additives is a high contrast recording medium.

Fig. 1—Effect of benzophenone on the exposure of KTFR by 15 keV electrons.
The initial thickness of the resist was 0.6 µm.

Γ has been defined here as a contrast function for photoresists.
Analogous to γ used in photography to specify the contrast of a film,
Γ is defined:

$$\Gamma = \frac{\text{threshold flux}}{\text{flux for maximum thickness}}$$

so that

$$0 < \Gamma \le 1.$$

A high Γ implies good contrast, giving sharp edges in the patterning
process.

Some of the results that have been obtained on increasing both the 
sensitivity and the contrast of KTFR and KPR are shown in Table 
II. Although benzophenone and hexaphenyldilead are equally efficacious 
in reducing the flux required for full exposure, benzophenone is the 
preferred additive since it is more soluble in photoresists. Benzil and 
1,4-diphenyl-1,3-butadiene show behavior quite similar to benzo- 
phenone. 
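The entries of Table II can be combined with the definition of Γ to recover two derived quantities: the factor by which an additive raises the sensitivity, and the threshold flux implied by each value of Γ. The sketch below uses only numbers transcribed from Table II; the function and variable names are ours, and the implied thresholds are computed values, not measurements reported in the table.

```python
# (flux for full exposure in µC/cm², Γ), transcribed from Table II.
TABLE_II = {
    ("KTFR", "none"): (8.0, 0.11),
    ("KTFR", "1.0% benzophenone"): (1.5, 0.33),
    ("KPR", "none"): (10.0, 0.09),
    ("KPR", "1.0% benzophenone"): (1.5, 0.38),
}


def implied_threshold(full_flux, gamma):
    """Threshold flux implied by Γ = threshold flux / flux for maximum thickness."""
    return gamma * full_flux


def sensitivity_gain(resist, additive):
    """Ratio of the full-exposure flux without and with the additive."""
    return TABLE_II[(resist, "none")][0] / TABLE_II[(resist, additive)][0]


for resist in ("KTFR", "KPR"):
    print(resist, round(sensitivity_gain(resist, "1.0% benzophenone"), 2))
# KTFR 5.33
# KPR 6.67

for key, (flux, gamma) in TABLE_II.items():
    print(key, round(implied_threshold(flux, gamma), 3))
```

The gains of about 5.3 and 6.7 correspond to the factor of five to seven quoted in the abstract. The implied threshold falls by less than a factor of two, so most of the improvement comes from the steeper rise above threshold, which is what the larger Γ expresses.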


3.2 The Electron Exposure of AZ-1350—A Positive Photoresist 


The solubility of AZ-1350 films as a function of the exposing flux of
15 keV electrons is shown in Fig. 2. Typical results obtained with
newly purchased AZ-1350 are presented in curves A and B and are
summarized below. At low exposures (<10⁻⁸ C/cm²) no solubilization
occurs and no image can be detected in the photoresist. Fluxes between
10⁻⁸ and about 6 × 10⁻⁶ C/cm² cause an image to form after develop-
ment; some of the resist is solubilized, but not all can be removed.
If the irradiation is brought up to fluxes between 6 × 10⁻⁶ and 8
× 10⁻⁶ C/cm², then the resist can be fully solubilized. Irradiation with
fluxes greater than 8 × 10⁻⁶ C/cm² yields an insoluble spot after de-
velopment.

TABLE II—ADDITIVES THAT INCREASE THE SENSITIVITY AND CONTRAST
OF NEGATIVE PHOTORESISTS

                                     Flux of 15-keV
                                     Electrons Needed
Resist   Additive                    to Expose (µC/cm²)    Γ

KTFR     None                        8                     0.11
         1.0% benzophenone           1.5                   0.33
         Hexaphenyldilead (sat.)     1.5                   0.33
         2.0% triphenylbismuth       1.9                   0.5
KPR      None                        10                    0.09
         1.0% benzophenone           1.5                   0.38

Fig. 2—Exposure of AZ-1350 by 15 keV electrons (resist thickness after
development in µm versus log flux in nC/cm²). Curves A and B, newly pur-
chased resist. Curves C and D, newly purchased AZ-1350 containing 2%
benzotriazole. The initial thickness of resist was 0.6 µm.
The mechanisms of the reactions involved in the response of AZ-
1350 to electron radiation have not been determined yet, but it is
likely that a crosslinking reaction of its phenol-formaldehyde polymer”*
leads to the insoluble product. It is also likely that the solubilization
reaction is the same one that occurs with light; quinone diazides are
converted to carboxylic acids. Solubilization arising from a scission
reaction of the phenol-formaldehyde polymer in which indiscriminate
bond breakage occurs and soluble low molecular weight compounds
result cannot be fully excluded, but it is not probable because scission
would have to predominate at low exposures and crosslinking at high
fluxes.


3.3 Increasing the Sensitivity of AZ-1350 by the Addition of Benzotriazole 

When 2 percent solutions of benzotriazole in AZ-1350 are prepared, 
the flux of electrons needed to solubilize the resist is decreased (curve 
C, Fig. 2). Benzotriazole also inhibits crosslinking that occurs at 
high electron fluxes (curve D, Fig. 2). To date, the lowest fluxes re-
quired for full exposure are 2 µC/cm².

Benzotriazole is not the only efficacious additive. All members of 
the benzotriazole, imidazole and indazole families that are soluble 
and that have a hydrogen atom bound to a nitrogen atom also de- 
crease the flux needed to solubilize AZ-1350. 

A full explanation for the effects of benzotriazole-type additives has 
not been developed yet, but crosslinking inhibition (curve D versus 
curve B, Fig. 2) probably arises from the antioxidant properties of 
benzotriazole; benzotriazole reacts with the free radicals generated by 
the absorption of energy before they cause crosslinking. 


IV. INCREASED LIGHT SENSITIVITY OF PHOTORESISTS 


Patterning semiconductor slices is conventionally done by contact 
exposure through a mask. In this process, a mask is placed on top of 
and in contact with a photoresist-coated slice and the photoresist ex- 
posed by ultraviolet light through the mask. In this way, a contact 
print of the mask is made on the photoresist. The lifetime of a mask 
copy is limited by abrasion of the mask during printing to about 10 
exposures for an emulsion copy and about 150 exposures for a chrome 
copy. More important, contact with the mask results in defects, such 
as pinholes in the pattern and mechanical damage to the photoresist-
coated slice. Defects related to contact printing have been recognized
in the past, but their effect on the yield of discrete semiconductor
devices is small. However, their effect on the yield of high precision, 
large area integrated circuits is much more severe. 

An electron beam pattern generator writing on the sensitized resists 
discussed in the previous section offers one means of patterning silicon 
slices. Projection photolithography systems would benefit if more 
sensitive photoresists were available, since exposure times would be
reduced and the problems of dust settling on the optics of the exposure 
system and the mechanical instabilities of the system would be mini- 
mized. 


4.1 Increasing the Sensitivity of KPR to Light 


Benzophenone and its derivatives have been reported to increase 
the sensitivity of polyvinyl cinnamate, the polymer in KPR, possibly 
because they increase the absorptivity of the system at longer wave- 
lengths.?* It was found here that benzophenone decreases the time re- 
quired to produce the maximum thickness of KPR films (polyvinyl 
cinnamate and sensitizer) after development, but that the threshold 
flux is not decreased. (See Fig. 3.) This implies that the edge resolu- 
tion of the image is increased, when benzophenone is present, and 
sharper etched lines should result. In neither this case nor in the
case of the sensitization of AZ-1350 discussed below does the sensiti-
zation appear to arise from increased light absorption since the ab-
sorption spectra of the resists with and without additives are identical.

Fig. 3—Effect of benzophenone on the exposure of KPR by light. The source
was a 150-W xenon lamp, 100 cm from the resist. The initial thickness of the
resist was 0.6 µm.


4.2 Increased Sensitivity of AZ-1350 to Light 
The exposure time required to fully solubilize AZ-1350 can be re-
duced when low concentrations of benzotriazole or similar compounds
are incorporated into the AZ-1350 film.²⁵ Some results are shown in
Fig. 4.


V. SUMMARY 


A photographic recording system suitable for use with a reticle gen-
erator has been developed. The electron flux required to fully expose
negative photoresists has been reduced from 8-10 µC/cm² to 1.5 µC/
cm² by incorporating additives into the resists. Other additives reduce
the flux required to expose AZ-1350 from 6 µC/cm² to 2-3 µC/cm².
Further reductions are anticipated for both systems. 

Most of the additives that sensitize the response of resists to elec- 
trons also increase their sensitivity to light. So far, the photon flux 
required for full exposure has been reduced by a factor of 50 to 100 
percent. 
Fig. 4—Effect of benzotriazole on the exposure of AZ-1350 by a 150-W xenon
lamp 100 cm from the sample. The initial thickness of the resist was 0.6 µm.
Device Photolithography:


Lenses for the Photolithographic System 


By DONALD R. HERRIOTT 
(Manuscript received July 10, 1970) 


The edge definition, maximum complexity and accuracy of details 
in photolithographic masks are limited by the performance of the lenses
in the system. The tolerances on exposure, sensitivity and uniformity of
the photosensitive materials, and processing are dependent upon the
images formed exceeding the minimum quality required. The lenses in
this system have been designed and fabricated to achieve the best prac- 
tical performance at this time in order to obtain the largest tolerances 
possible. This paper details the design parameters chosen, the construc- 
tions used and the performance obtained by each of the lenses in the 
system. 


I. INTRODUCTION


There are two classes of photographic mask-making systems. In 
the first class, the pattern is generated through a lens as in a cathode- 
ray-tube plotter or primary pattern generator (PPG), or a lens is 
used to reduce the size of the pattern to that of the circuit being made. 
The maximum complexity of pattern in this type of system is limited 
by the resolution that can be obtained over the field of a lens. 

A second class of systems uses a lens imaging a single small spot 
of light that is moved over an area and modulated to write a pattern. 
In this type of pattern generator the complexity of pattern is limited 
only by the minimum spot size and the area covered. This system must 
be used to draw the mask at the same scale as the final circuit or the 
lens in a reduction camera would limit the resolution. 

Systems in the first class have been chosen for the mask laboratory 
in spite of the resolution limitations because of the speed and flexi- 
bility of the lens type systems for making a wide variety of masks. 
As a result the lenses in the system are the principal limitation on the 
maximum complexity of patterns that can be produced and on the 
quality of the images. 
The performance of lenses is limited by the wavelength of light,
the aperture of the lenses, and the aberration correction of the lenses.
The wavelength of visible light is about half a micron, and it is
theoretically possible to obtain light distributions in an image having
cycles of light and dark of about one-half-micron width. Blue light
can be imaged with better resolution because it has a shorter wave- 
length than green or red light. The wavelength that can be used in 
making masks is limited by the sensitivity of the photographic mater- 
ials, the available light sources and the transmittance of the glasses 
used in the lenses and as a substrate for the photosensitive materials. 

The resolution is also limited by diffraction. It would be necessary 
to bring light to the image from a cone subtending an angle of 180° to 
resolve spatial images with periods of one wavelength. A smaller angle 
of light to an image will limit the resolution to larger detail. The large 
apertures of the lenses used in this system are required for resolution 
of the detail in the masks rather than to collect light. 

The resolution of a lens may also be limited by aberrations. A single 
lens element with spherical surfaces will not image the light passing 
through it from a point in the object to a point in the image. Aspheric 
surfaces could be used to do this for a point on the axis of the system 
but not for points off axis. These defects in the imagery can be greatly 
reduced by combining many elements designed to compensate for the 
aberrations. It is not possible to reduce these aberrations to zero but 
they can be made smaller than the diffraction effects by using complex 
combinations of lenses. 


II. MODULATION TRANSFER FUNCTIONS 


A convenient measure of the quality of an optical image is the 
modulation transfer function (MTF). This is a curve of the contrast 
that is obtained in the image of a sinusoidal intensity target as a func- 
tion of the spatial frequency of the target. Figure 1 shows a series of 
MTF curves for perfect lenses of various aperture ratios. The MTF 
varies from 0 to 1.0 and is the ratio of the contrast in the image to that 
of the target. The spatial frequency scale is in cycles per mm and 
covers the general range of interest in mask-making systems. As you 
can see in Fig. 1, the smaller the f/number, the better the contrast and 
the higher in spatial frequency it extends. Thus, to get a high quality 
image of 25-μm lines in a reduction camera may require only an f/8 cone
angle to the image, but good 1-μm lines in a step-and-repeat camera
require a lens of f/2 or faster.

Fig. 1—MTF as a function of spatial frequency in an image formed by a
cone of light of the indicated “f” number.
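
The curves for perfect lenses can be approximated numerically. The following Python sketch assumes the standard textbook expression for the MTF of an aberration-free lens with a circular aperture in incoherent light. Neither the formula nor the function names are taken from this paper, and the default wavelength of 0.4358 μm is simply the mercury blue line used in the reduction cameras.

```python
import math


def cutoff_frequency(f_number, wavelength_um=0.4358):
    """Spatial frequency (cycles/mm) above which a perfect lens passes no contrast."""
    return 1000.0 / (wavelength_um * f_number)


def diffraction_mtf(freq_cycles_per_mm, f_number, wavelength_um=0.4358):
    """MTF of an aberration-free circular aperture in incoherent light (assumed model)."""
    x = freq_cycles_per_mm / cutoff_frequency(f_number, wavelength_um)
    if x >= 1.0:
        return 0.0
    return (2.0 / math.pi) * (math.acos(x) - x * math.sqrt(1.0 - x * x))


# Compare cone angles at 200 cycles/mm (2.5-um lines and spaces).
for f_number in (2, 4, 8):
    print(f_number,
          round(cutoff_frequency(f_number), 1),
          round(diffraction_mtf(200.0, f_number), 3))
```

The sketch reproduces only the qualitative behavior described above: at a fixed spatial frequency the contrast rises as the f/number falls, and the cutoff frequency moves outward in proportion.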


This requirement for low “f” numbers for high resolution may seem 
strange to those who are used to stopping down the lens to get a 
sharper image. This is because conventional camera lenses are limited 
in performance by their aberrations and stopping down the lens re- 
duces these aberrations. The best resolution is probably obtained at 
about f/8; the image gets poorer when stopped down beyond that 
because of the diffraction limits shown in the MTF curves. Photo- 
graphic lenses are often used in low-light conditions and the value of 
the increased speed obtained by increasing the aperture is more 
important than the loss in resolution caused by the aberrations. 

In contrast, the large apertures of lenses for mask-making systems 
are almost always picked for resolution rather than speed. It is there- 
fore necessary to reduce the aberrations to values that are small in 
comparison to their diffraction effects. There is still a compromise 
region. A lens for a 2.5-μm linewidth mask should have an MTF of over
60 percent at 200 cycles/mm. This could be obtained either with a
perfect f/4 lens or with an f/3 lens with some aberrations. It could also
be obtained with an f/2 lens with larger aberrations, but unless the
exposure speed of the lens were critical, the greater complexity of the
f/2 lens would make it more expensive and prone to larger errors in
fabrication.

A second reason to select the smaller aperture is its increased depth 
of focus. When projecting an image directly onto a non-flat silicon 
wafer, this can be of major importance. In making masks on glass it 
determines the flatness tolerance; in all cases it determines the ac- 
curacy to which the cameras must be focussed and the stability of this 
focus. 
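
The trade between aperture and depth of focus can be illustrated with the Rayleigh quarter-wave criterion. That criterion is an assumption made for this sketch and is not a figure quoted in the paper. Under it, the tolerable defocus is about ±2λN² for a lens of f/number N at wavelength λ.

```python
def rayleigh_depth_of_focus_um(f_number, wavelength_um=0.4358):
    """Half-range (+/-) of defocus, in um, that gives a quarter-wave wavefront error.

    Assumes the Rayleigh quarter-wave criterion; the default wavelength is the
    mercury blue line used in the reduction cameras.
    """
    return 2.0 * wavelength_um * f_number ** 2


# Smaller apertures (larger f/numbers) tolerate more defocus.
for f_number in (2.0, 3.5, 4.15, 8.0):
    print(f_number, round(rayleigh_depth_of_focus_um(f_number), 2))
```

Because the tolerance grows as the square of the f/number, a modest reduction in aperture relaxes the focus and flatness requirements considerably.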


III. SYSTEM CONSIDERATIONS 


The lenses used in this mask-making system have been designed for 
practical operation in a production system. The parameters have been 
selected to advance the state of the art in each area and to obtain the 
largest tolerance possible in each operation of the mask system. 

The performance of each part of the system is limited by the lens. 
The 26,000 address width of the pattern generator field is near the max- 
imum that can be obtained with the aperture limits of the scanning 
system. The 5000 linewidth square field of the step-and-repeat camera 
is even more challenging to the lens designer for the small image in- 
volved. The reduction-camera lenses are not as difficult but have been 
designed for higher performance and therefore greater tolerances in use. 

All of the lenses have been designed without major consideration of 
cost as even small improvements in performance would result in opera- 
ting savings in excess of any reasonable cost. 


IV. LENS DESIGN 


The design of specialized lenses of this type is far ahead of the abil- 
ity to manufacture them with uniform quality. In recent years auto- 
matic lens design programs have been developed which efficiently find 
the optimum design from each starting point while placing the desired 
importance on each characteristic. For instance, it has been found that 
designs of the types used are capable of essentially zero field distortion. 
It would be difficult using manual design techniques to find designs 
completely free of distortion. With automatic design programs, a small 
weight on distortion will cause new designs to be selected by the pro- 
grams that are free of distortion until it is necessary to compromise 
other characteristics. The designer can then see just what must be 
sacrificed in one characteristic for gain in the other. 

Either the lens designer must learn all of the other
parameters of the mask system, or the system designer must understand
the lens design difficulties, to arrive at suitable system compromises.
The development of automatic design programs has made it
reasonable for the system designer to explore the design of the lens 
while designing the system. A variety of lens designs for the lenses of 
this program were explored by the systems designer although the final 
lenses were designed and constructed by an experienced lens design 
group at Tropel, Inc.* In this manner, the system parameters were 
selected, a suitable performance target could be determined, and a 
tentative choice between performance and complexity could be made 
prior to final lens design. 


V. LENS ASSEMBLY 


All of the lenses in the system have maximum wavefront aberrations
of approximately λ/4. They have up to 14 air-glass surfaces as
well as two or more cemented surfaces. The quality of each of these
surfaces must be very good so that the accumulated errors of
the individual surfaces, including the inhomogeneity of the glass, do
not approach the aberration tolerance. The centering and spacing of
the elements must be of extraordinary quality to maintain the diffrac- 
tion limited performance. Conventional techniques for measuring and 
controlling the centering and spacing of lens elements are not sensitive 
or accurate enough for lenses of this type. The lenses have been as- 
sembled by Tropel using new techniques that they have developed in 
recent years. We have carried out a program at the Laboratories to 
explore improved interferometric techniques that will make even better 
lens systems feasible. 


VI. LENS EVALUATION 


Lenses are now evaluated by photoelectrically measuring the mod- 
ulation transfer function in a lens bench. This is done by scanning the 
image of a periodic target with a slit or the image of one slit with 
a second one and calculating the transfer function. For lenses of this 
quality, the slits must be extremely narrow and the measurement is 
limited by the photon noise of signals through the slits and the stabil- 
ity of the lens bench and air during the time of measurement. One 
measured curve is shown for the 3.5X lens but the measurement is not 
convincing as the curve goes above theoretical values at high fre- 
quency. Wavefront measuring methods are now being developed from 
which better MTF curves should be obtained. 


* Located in Fairport, New York. 


VII. PATTERN GENERATOR LENS


The pattern generator lens has very special requirements. It must 
both collimate the laser beam before it is reflected from the polygonal 
mirror and then image the reflected beam to a flat focal plane on the 
photographic plate. The effective aperture position for the lens is at 
the surface of the mirror. The gaussian light distribution in the aper- 
ture of the lens is controlled by the illuminating laser beam. Although 
the lens is corrected at f/10, the writing beam fills the aperture with 
an f/22 cone angle which gives a 10-μm-diameter gaussian distribution
in the image. The code beam fills a larger aperture in the scan
direction so that a higher modulation is obtained when the image scans
the 7-μm bars and spaces of the code beam. The lens must provide
a large amount of barrel distortion so that a constant angular rate 
of the scanning mirror provides a uniform linear scan in the focal 
plane. The combination of no vignetting of the laser beam in the lens 
and a uniform linear velocity of the scan gives a uniform exposure 
over the plate. Figure 2 shows the scanning lens and Fig. 3 shows the 
calculated MTF of this design. 
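
The need for barrel distortion can be seen from a short calculation. A distortion-free lens places a beam entering at angle θ at a height f·tan θ, whereas a uniform linear scan at constant angular rate requires a height f·θ. The Python sketch below compares the two; the function names and the 20° example angle are illustrative assumptions and are not parameters of the actual scanning lens.

```python
import math


def height_distortion_free(focal_length, theta_rad):
    """Image height for an ideal distortion-free lens."""
    return focal_length * math.tan(theta_rad)


def height_uniform_scan(focal_length, theta_rad):
    """Image height needed so that equal angle steps give equal linear steps."""
    return focal_length * theta_rad


def required_distortion(theta_rad):
    """Fractional distortion the lens must supply; negative values mean barrel."""
    if theta_rad == 0.0:
        return 0.0
    return theta_rad / math.tan(theta_rad) - 1.0


theta = math.radians(20.0)  # hypothetical half-field scan angle
print(round(100.0 * required_distortion(theta), 2), "percent")
```

The required distortion is negative at every nonzero angle and grows with field angle, which is the barrel distortion the text calls for.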


VIII. REDUCTION-CAMERA LENSES 


The reduction-camera lenses image the pattern generator plate onto
HRP photographic plates. The mercury 435.8-nm spectral line is used
so that only the monochromatic aberrations are critical. The lenses
are corrected for first-order axial and lateral color at this wavelength.
The field angle is a compromise between camera length and aberration
correction. The entrance pupil distance is the same for both the 3.5X
and 1.4X lenses so that the same illumination system can be used for
both. Microflat glass plates are used in this camera so depth of focus
is not important. The apertures have been selected to give best image
quality and an iris is built into each lens so that they can be stopped
down if poorer quality glass is used.

Fig. 2—Cross section of pattern generator lens.

Fig. 3—MTF curves for the pattern generator lens on axis and at the edge
of the field in relation to the fundamental frequency of the 7-μm address and
35-μm linewidth.

The 435.8-nm wavelength was selected as a compromise between the
better resolution at this wavelength, which is shorter than the more
commonly used 546.1-nm line, and the smaller amount of scattered light in the
green. The scattering in the blue is greatly reduced by using the dyed 
emulsion plates that are described in another article in this issue. 


IX. 3.5X REDUCTION-CAMERA LENS


The 3.5X reduction-camera lens shown in Fig. 4 is a seven-element 
double-Gauss type operating at f/3.5 and having a focal length 
of 17.7 cm. Efforts were made to use an eight-element design for better 
performance but the improvement was not judged sufficient to exceed 
the probable losses in an extra element. Figure 5 shows the MTF 
curves for this lens on axis and at the edge of the field along with the 
diffraction limit for the lens aperture used. The fundamental frequency
for the 10-μm minimum linewidth used would be at 50 cycles per mm, where
the response is 70 percent or greater. There is significant response at
a number of harmonics of this frequency to better reproduce sharp
edges.

Fig. 4—Cross section of 3.5X reduction camera lens.

The intensity distribution for a square-wave object can be calculated
from the response at the various harmonics in the source. Figure 6
shows the intensity distribution calculated for this lens from a
10-μm periodic square-wave object, an isolated 10-μm line at the center
of the field, and at the edge of the field. It is important that the slope
of these curves at the edge of the line be large so that variations of
exposure caused by light-source fluctuation, photographic-material
sensitivity variation, and developing chemistry, time, or temperature will
not have a large effect on the linewidth developed from the image. As
can be seen here, the isolated line and periodic lines would require a
slightly different exposure to both have correct linewidth. While this
different exposure can be used to obtain accurate linewidth on masks
having predominantly isolated or periodic lines, only a lens with a good
MTF will give consistently accurate dimensions on all types of features.

Fig. 5—Measured and calculated MTF curves for 3.5X lens.

Fig. 6—Intensity distributions for 10-μm-wide periodic and isolated lines on
axis and at the corner of the field of the 3.5X lens.

Fig. 7—Cross section of 1.4X reduction-camera lens.

Fig. 8—MTF curves for the 1.4X reduction-camera lens.
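
The harmonic calculation described above can be sketched in Python. The sketch sums the odd harmonics of an ideal square wave, each weighted by the MTF at that harmonic. Because the measured MTF of the lens is not tabulated here, the diffraction-limited MTF of a perfect f/3.5 lens at 0.4358 μm is used as a stand-in; that substitution, and all function names, are assumptions for illustration.

```python
import math


def diffraction_mtf(freq_cycles_per_mm, f_number, wavelength_um=0.4358):
    """Stand-in MTF: aberration-free circular aperture, incoherent light."""
    x = freq_cycles_per_mm * wavelength_um * f_number / 1000.0
    if x >= 1.0:
        return 0.0
    return (2.0 / math.pi) * (math.acos(x) - x * math.sqrt(1.0 - x * x))


def fundamental_frequency(linewidth_um):
    """Cycles/mm of equal lines and spaces: one period is two linewidths."""
    return 1000.0 / (2.0 * linewidth_um)


def square_wave_image(x_um, linewidth_um, f_number, max_harmonic=99):
    """Relative intensity at position x for a periodic line/space object.

    The bright line occupies 0 < x < linewidth; x = 0 is a line edge.
    """
    f0 = fundamental_frequency(linewidth_um)
    intensity = 0.5
    for n in range(1, max_harmonic + 1, 2):  # odd harmonics only
        phase = 2.0 * math.pi * n * f0 * (x_um / 1000.0)
        intensity += (2.0 / math.pi) * diffraction_mtf(n * f0, f_number) * math.sin(phase) / n
    return intensity


# Profile across one 20-um period of 10-um lines and spaces.
for x in range(0, 21, 2):
    print(x, round(square_wave_image(float(x), 10.0, 3.5), 3))
```

The slope of the printed profile near x = 0 and x = 10 μm is the quantity the text identifies as governing exposure tolerance: the more harmonics the lens passes, the steeper the edge.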


X. 1.4X REDUCTION-CAMERA LENS


The outline of the 1.4X lens is shown in Fig. 7. While a double- 
Gauss type could have been used for this lens, this rather unusual 
configuration gave better performance for the specific requirement and 
the size is much smaller than the double-Gauss type. 

The focal length is 32.4 cm and the overall length is 128.4 cm. The
f/4.15 aperture provides a smaller cone to the image than the 3.5X
lens but accepts a larger cone of light from the object, providing better
resolution relative to the finest line.

Figure 8 shows the MTF curves for the 1.4X reduction-camera lens 
and Fig. 9 shows the corresponding intensity distribution for periodic 
and isolated 25-μm lines. The 80 percent MTF at the fundamental
frequency of the line results in a sharper line edge in the intensity pro- 
file and a resulting larger tolerance in exposure. 

Fig. 9—Intensity distributions for isolated and periodic lines imaged by the
1.4X lens.

Fig. 10—Performance of a group of photolithographic lenses plotted as the
number of linewidths per field width as a function of the linewidth at which 0.5
MTF is obtained.


XI. CAPABILITY OF GENERAL PHOTOLITHOGRAPHIC LENSES


The designs of the lenses in this system, including a 7X reduction- 
camera lens that has not been used, show the general range of per- 
formance that can be obtained. Figure 10 shows the number of thou- 
sands of linewidths per field as a function of the linewidth at 0.6 MTF. 
The shaded region indicates the area of reasonable design. There is 
not a smooth curve through these points as different lens types are 
used. A smoother curve could be drawn for each lens type. The 4X 
projection lens, below the shaded area, is limited in aperture and
therefore resolution because of the required depth of focus. The 10X
step-and-repeat lens is a very reliable point as many designers have
designed lenses having these parameters. The step-and-repeat lens is
described in detail in another paper in this issue.


Device Photolithography: 


Reduction Cameras: Optical Design and 
Adjustment 


By ERIC G. RAWSON 
(Manuscript received June 2, 1970) 


This paper describes the optical design of the photolithographic reduction 
cameras and discusses in detail several aspects of the illumination system 
including the light source spectrum, the method of attaining even
illumination, and the use of a Fresnel condenser lens. The camera design provides
for first-order correction of focus and magnification shifts due to changes 
in ambient temperature. To adjust the cameras for best focus and proper 
magnification, a new technique using a special test reticle and digital 
computers was developed. It automates much of the procedure and processes 
much more data than would otherwise be possible. The reticle allows simul- 
taneous measurement of focus and magnification errors throughout the 
image field, and a time-shared computer calculates the required corrective 
shifts on the object- and image-spacer bars. 


I. INTRODUCTION 


This paper and the paper immediately following describe the two 
reduction cameras which have been developed to serve as part of 
the photolithographic mask-making facility described in this issue. 

The primary pattern generator¹ generates artwork masks which are
nominally 17.5 cm square. This size was determined by various optical
and mechanical considerations. The two reduction cameras reproduce
these masks at the two specific, reduced sizes required for use as
masters for tantalum thin film circuits and interconnection substrates;
the reduced masks from one of these cameras (the 3.5X camera, shown
in Fig. 1) can also serve as the reticle in the step-and-repeat camera.²
The two reduction ratios, together with the corresponding mask size
and minimum linewidths, are summarized in Table I. In each size the
minimum linewidth is 1/5000 of the width of the mask.
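
The entries of Table I follow directly from the artwork size, the reduction ratio, and the 1/5000 rule. A minimal Python check, with function and constant names chosen here for illustration only:

```python
ARTWORK_SIZE_CM = 17.5        # nominal artwork mask width
LINEWIDTHS_PER_FIELD = 5000   # minimum linewidth is 1/5000 of the mask width


def reduced_plate(reduction_ratio):
    """Return (plate size in cm, minimum linewidth in um) for a given reduction."""
    size_cm = ARTWORK_SIZE_CM / reduction_ratio
    min_linewidth_um = size_cm * 1.0e4 / LINEWIDTHS_PER_FIELD
    return size_cm, min_linewidth_um


for ratio in (1.0, 1.4, 3.5):
    size_cm, linewidth_um = reduced_plate(ratio)
    print(ratio, round(size_cm, 2), round(linewidth_um, 1))
```

A ratio of 1.0 corresponds to the artwork plate itself.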

This reduction in mask size carried out by the reduction cameras
must be accomplished without significant loss of resolution in the
minimum-width details and without introducing distortions greater
than about half of the minimum linewidth. The degree to which these
two requirements can be met is primarily determined by the resolution
and distortion characteristics of the reduction lens; the design
considerations of such lenses are a major topic in themselves,
presented elsewhere in this issue.³ Given a lens of suitable quality,
however, the camera’s performance is still critically dependent on three
factors: (i) The mechanical design of the camera must be such as to
maintain its performance in the presence of environmental perturbations
such as vibration, changes in ambient temperature, and variations
in the operators’ handling techniques. (ii) Camera performance
is dependent upon the proper design and adjustment of the illumination
system. (iii) The performance of the camera can be no better than
one’s ability to adjust the completed camera for best focus and proper
magnification over the whole image field, not a trivial task for cameras
in this performance class. The mechanical design of the reduction
cameras is discussed in the following paper. In this paper, we consider
the problems of the optical design and the final adjustment.

Fig. 1—The 3.5X reduction camera.

TABLE I—COMPARISON OF THE ARTWORK PLATE WITH THE OUTPUT
PLATES OF THE 1.4X AND THE 3.5X CAMERAS

                      Artwork             1.4X Reduction      3.5X Reduction
Size (max. nominal)   17.5 cm sq          12.5 cm sq          5 cm sq
Minimum linewidth     35 μm               25 μm               10 μm
Use                   Input to reduction  Tantalum thin       Tantalum thin
                      cameras             film masks          film masks,
                                                              step-and-repeat
                                                              reticle


II. OPTICAL DESIGN 


Figure 2 shows the optical layout of the reduction cameras. The 
light source is a 100-watt mercury arc lamp operating at a pressure
of about 10 atmospheres. This light source, suitably filtered, diffused, 
and modulated as described below, is imaged by a two-element Fresnel 
condenser lens onto the entrance pupil of the reduction lens. The 
convergent beam illuminates the artwork plate, which the reduction 
lens images onto the output image plate at the right. The image 
plate used is a Kodak Microflat High Resolution Plate with ground 
edges. 

Fig. 2—Optical layout of the reduction cameras.


One of the design requirements for the two reduction lenses was
that they have equal entrance pupil distances. This allowed us to use
a single illumination system design for both the 1.4X and the 3.5X
cameras. It was found that such a restriction could be imposed on
the reduction lenses without significantly compromising their design
performance.

A mercury arc light source was chosen for reliability and spectral
narrowness. The 4358 Å blue line was selected rather than the 5461 Å
green line because of the extra resolution afforded by the shorter
wavelength, and because it left open the possibility of direct exposure
onto photoresist surfaces which are sensitive to the blue but not to
the green light. Although the scattering of light within the emulsion
(which varies as the fourth power of the light frequency) is greater
for the blue line, it is not a serious problem in this case, where the
emulsion is 6 μm thick and the finest structure to be written on it is
10 μm wide.
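
The size of the scattering penalty follows from the fourth-power law quoted above. A short Python estimate, in which the function name is an illustrative choice:

```python
def scattering_ratio(short_wavelength, long_wavelength):
    """Scattering at the shorter wavelength relative to the longer one.

    Assumes scattering proportional to the fourth power of frequency,
    that is, to the inverse fourth power of wavelength.
    """
    return (long_wavelength / short_wavelength) ** 4


# Blue 4358-angstrom line relative to the green 5461-angstrom line.
print(round(scattering_ratio(4358.0, 5461.0), 2))
```

On this estimate the blue line scatters roughly two and a half times as strongly as the green line within the emulsion.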

The lamp chosen, a General Electric H100 A4/T, represents a compromise
between brightness and spectral narrowness. High-pressure
lamps, though brighter, exhibit sufficient pressure (Lorentz) broadening
(see for example Ref. 4) of the 4358 Å line to complicate the color
correction of the reduction lens, which would necessarily compromise
its overall performance. The lamp brightness results in exposure times
of 3-4 seconds for the 3.5X camera and 20-25 seconds for the 1.4X
camera.

Two glass filters (Corning Filters No. 3389 and No. 5543) are used
to isolate the Hg 4358 Å spectral line.

The image of the arc source projected onto the entrance pupil by the
Fresnel condenser, although magnified slightly by the Fresnel lens,
is still too small to fill the entrance pupil. Therefore, a ground-glass 
diffusing screen is placed in front of the mercury lamp to increase 
the apparent size of the light source. The amount of this increase can 
be controlled by adjusting the axial position of the ground-glass dif- 
fuser. This position is adjusted until the diffused image of the source 
just overfills the entrance pupil. 

The accumulation of reflection losses at air-glass interfaces throughout 
the camera results in considerable transmission loss. While this in itself 
is not serious, the difference in the losses experienced by paraxial rays 
and by rays near the edge of the field, which arises because of differences 
in the angles of incidence, results in an illumination intensity which 
falls off seriously with field angle. This intensity fall-off is compensated 
by the intensity corrector plate (see Fig. 2) which has a thin, vacuum- 
deposited absorbing layer of Inconel. The amount of deposited Inconel 
decreases radially from the plate center so as to compensate for the 
field-angle-dependent losses of the rest of the camera. Figure 3 shows
the arrangement used in the vacuum evaporation process. The aperture
defines a finite area source for the Inconel vapor. The aperture diameter
and height, and the height of the glass substrate, were adjusted
empirically to yield the flattest intensity distribution in the image plane. This
was determined by scanning the image plane with a pinhole, integrating
sphere and photomultiplier tube assembly. The result is shown in Fig.
4 which is a plot of the measured illumination intensity as a function of
position within the image plane. Each horizontal scan extends beyond
the active image area; the vertical tic marks on each scan delimit the
actual image field. Figure 4 also shows the plane of constant intensity
which best fits the measured data. It can be seen that the measured
intensity distribution deviates from this plane of best fit everywhere
by less than ±7 percent.

Fig. 3—Arrangement used in the vacuum evaporation of the Inconel coatings
onto the intensity corrector plate.

The Fresnel condenser lens* was designed specifically for these cam- 
eras and provides for zero spherical aberration at the particular object 
and image distances of our illumination system. The material is Rohm 
and Haas VM plexiglass. The lens is made up of two elements, each 
approximately 0.060” thick, cemented around the rim face-to-face, as 
illustrated in Fig. 5. Opposing facets on the two halves have dissimilar 
facet angles; these angles were chosen to equalize the optical power 
of opposing facets, thereby minimizing reflection losses. The ability to 
specify the angle of each facet is equivalent to allowing general 
aspheric surfaces on a conventional lens. The result is that axial spheri- 
cal aberration can be eliminated completely from the lens design, min- 
imizing the problem of illumination fall-off with field angle. In 
addition, the ability to specify the angle of the cutback facet assures 
that the scattering of light by this facet will be minimal. The Fresnel 
lens is laterally located in the camera with three corner pins riding 
in radially oriented slots, so as to allow free thermal expansion with- 
out buckling or decentering. 

A Fresnel condenser was chosen rather than a conventional glass 
condenser largely because of the difficulty in obtaining large glass 
lenses sufficiently free of bubbles. Such bubbles, if larger than a milli- 
meter or so, modulate the illumination light cone at sufficiently low 
spatial frequencies to seriously perturb the intensity distribution in the 
image plane. On the other hand, the ring pattern of the Fresnel lens 
facets is at a sufficiently high spatial frequency that its effect is not 
detectable in the image plane. The moiré effects and erratic
illumination uniformity usually seen in Fresnel lens combinations are
eliminated by maintaining a tight tolerance on the alignment of the two
Fresnel elements during fabrication.

* [illegible] and fabricated by the Alliance Tool and Die Co., Rochester, New
York.

Fig. 5—Optical design of the special two-element Fresnel lens.

The design and performance of the 3.5X and the 1.4X reduction 
lenses are discussed in another paper in this issue. 


III. TEMPERATURE COMPENSATION


The reduction cameras are designed to operate in an environment in 
which the temperature is regulated to ±0.25°F. Nonetheless, as an
additional precaution, it was decided to provide first-order compensa- 
tion for changes in focus and magnification due to changes in ambient 
temperature. For this purpose a computer program was written* which 
determines the effects of temperature changes on the focus and magnifi- 
cation of a lens, taking into account the thermal coefficients of volume 
expansion, refractive index, and dispersion of each glass element, and 
calculating approximate changes in air gaps from the thermal expansion 
coefficient of the lens barrel material. It is then possible to calculate, 
using either the ACCOS† lens optimization program or an equivalent
optimization program, how the object and image distances must change 
with temperature in order to maintain best focus and proper magnifica- 
tion. First-order temperature compensation is then attained by selecting 
the spacer materials to provide the appropriate thermal expansion 
coefficients. 
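
The last step, choosing spacer materials, can be illustrated with a small Python sketch. Everything numerical in it is hypothetical: the 80-cm spacer length, the required shift of 8 μm per degree, the use of two materials in series, and the nominal handbook expansion coefficients are assumptions for illustration and are not values from the cameras described here.

```python
# Nominal handbook expansion coefficients per degree C (approximate).
ALPHA_INVAR = 1.2e-6
ALPHA_ALUMINUM = 23.0e-6


def required_expansion_coefficient(spacer_length_cm, shift_um_per_deg):
    """Coefficient a spacer needs so its length tracks the required conjugate shift."""
    return shift_um_per_deg * 1.0e-4 / spacer_length_cm


def series_length_fraction(alpha_target, alpha_a, alpha_b):
    """Fraction of total length in material A when A and B are joined end to end."""
    if alpha_a == alpha_b:
        raise ValueError("materials must have different coefficients")
    return (alpha_target - alpha_b) / (alpha_a - alpha_b)


alpha = required_expansion_coefficient(80.0, 8.0)       # hypothetical inputs
fraction_aluminum = series_length_fraction(alpha, ALPHA_ALUMINUM, ALPHA_INVAR)
print(alpha, round(fraction_aluminum, 3))
```

A fraction outside the range 0 to 1 would indicate that the chosen pair of materials cannot supply the required coefficient.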


IV. FOCUS AND MAGNIFICATION ADJUSTMENT 


As the resolution and magnification accuracy demanded of
photolithographic lenses increase, it has become apparent that traditional
methods of focus and magnification adjustment are impractically slow.
In order to adjust the reduction cameras properly it was necessary to
develop an adjustment system which automates much of the procedure
and processes much more data than would otherwise be possible.
The system is broadly divisible into three parts: (i) a special test
reticle which allows simultaneous measurement of focus and magnification
errors at nine points distributed over the image plane; (ii) a
computer-controlled, interferometric, coordinate measuring machine⁵
to locate fiducial marks on the test image plates and punch their
coordinates on paper tape; and (iii) a set of computer programs to
analyze the paper tapes, establish the current camera errors, and
calculate the necessary corrective shifts on the spacer rods. The rod
adjustment mechanism is based on the elastic compression of Belleville
spring washers and is described in detail in the following paper.

* The Fortran IV TEACOPS program (Temperature Effects Analysis of
Complex Optical Systems) is available on request from the author.

† ACCOS (Automatic Correction of Complex Optical Systems) is copyrighted
by Scientific Calculations, Inc., Rochester, N. Y.

The test reticle is shown in Fig. 6. An 8” x 10” photographic
plate has a test pattern consisting of horizontal and vertical bar
patterns of various spatial frequencies, arranged in continuous vertical
stripes. Nine glass prisms are cemented to the face of the plate in a
3 × 3 array covering the desired object field. Nine identical prisms
are cemented to the back of the plate to compensate for the refraction
of the illumination beam. The effect of the first nine prisms is to tilt
the apparent plane of the object test pattern seen in the prism. Nine
flat spacer pads displace the test reticle to the rear when mounted in
the camera, so that the tilted object test patterns straddle the true
object plane. In addition, fiducial marks are placed in the center of
each prism field, to be used to check magnification.

Fig. 6—The special test reticle used for simultaneous adjustment of focus and
magnification.

The test reticle is placed in the camera and a test image plate is 
exposed and developed. Measurements are made, in each of the nine 
prism areas on the image plate, of the vertical position where the 
pattern appears sharpest. Additionally, the coordinates of the nine 
fiducial marks are measured. This information, when compared with 
reference measurements made on the test reticle itself, yields the 
focus errors in each of the nine prism regions and twelve magnifica- 
tion errors, corresponding to twelve distances between the nine fiducial 
marks. Figure 7 shows focus and magnification error maps printed by
our time-sharing computer program. The data shown are for a
well-adjusted 3.5X camera. Note that the magnification errors, Fig.
7(b), are at worst 0.37:10,000 and average 0.20:10,000. This is to
be compared to the maximum allowable error of 1:10,000. Similarly,
the focus errors, Fig. 7(a), are at worst 9.4 μm and average 4.1 μm.
These numbers are comparable to the diffraction-limited depth
of focus.
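
The magnification figures can be checked from the twelve magnifications printed in Fig. 7(b). The Python sketch below takes those twelve values as read from the printout and expresses each as a deviation from the nominal 3.5 in parts per 10,000; the variable and function names are illustrative.

```python
import math

NOMINAL_MAGNIFICATION = 3.5

# Twelve measured magnifications as printed in Fig. 7(b).
MEASURED = [
    3.499956, 3.500113, 3.500025, 3.500130, 3.499968, 3.500027,
    3.499993, 3.499968, 3.500044, 3.499875, 3.499923, 3.500021,
]


def deviation_per_10000(measured, nominal=NOMINAL_MAGNIFICATION):
    """Magnification error in parts per 10,000 of the nominal value."""
    return (measured / nominal - 1.0) * 1.0e4


deviations = [deviation_per_10000(m) for m in MEASURED]
rms_deviation = math.sqrt(sum(d * d for d in deviations) / len(deviations))
largest_deviation = max(abs(d) for d in deviations)

print(round(rms_deviation, 3), round(largest_deviation, 3))
```

Computed this way, the largest and RMS deviations agree closely with the 0.37 and 0.20 quoted in the text, and all twelve fall well inside the allowable 1:10,000.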

Other parts of the time-shared computer program calculate (using 
paraxial optical equations) the object- and image-distance shifts nec- 
essary to bring each of the nine prism regions into best focus and 
magnification. These shifts are then fitted (using the method of least 
squares) onto tilted and axially displaced object and image planes. 
Finally, the program calculates the length changes required on each 
of the six spacer rods to bring the existing object and image planes 
into conjunction with the desired planes. 
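
The least-squares step can be sketched as a fit of a plane z = a + b·x + c·y to the nine required shifts, where a is the axial displacement and b and c are the tilts. The Python code below is only an illustration of that fit. It is not the original time-shared program, and the grid coordinates and shift values in the example are hypothetical.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane(points):
    """Least-squares plane through (x, y, z) points; returns (a, b, c)."""
    n = float(len(points))
    sx = sum(x for x, _, _ in points)
    sy = sum(y for _, y, _ in points)
    sz = sum(z for _, _, z in points)
    sxx = sum(x * x for x, _, _ in points)
    syy = sum(y * y for _, y, _ in points)
    sxy = sum(x * y for x, y, _ in points)
    sxz = sum(x * z for x, _, z in points)
    syz = sum(y * z for _, y, z in points)
    matrix = [[n, sx, sy], [sx, sxx, sxy], [sy, sxy, syy]]
    rhs = [sz, sxz, syz]
    d = det3(matrix)
    if d == 0.0:
        raise ValueError("points do not define a plane")
    coefficients = []
    for col in range(3):  # Cramer's rule, one column at a time
        replaced = [row[:] for row in matrix]
        for r in range(3):
            replaced[r][col] = rhs[r]
        coefficients.append(det3(replaced) / d)
    return tuple(coefficients)


# Hypothetical shifts (um) at a 3 x 3 grid of prism positions.
grid = (-1.0, 0.0, 1.0)
shifts = [(x, y, 2.0 + 0.5 * x - 0.25 * y) for x in grid for y in grid]
print(fit_plane(shifts))
```

Evaluating the fitted plane at the position of each spacer rod would then give the length change required of that rod.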

In general, approximately 6-8 iterations of this correction cycle are 
required to bring the camera into adjustment such as is shown in 
Fig. 7. During the last few iterations, a modified procedure is followed 
in which a test reticle without prisms is used in addition to the prism 
test reticle: the former provides the magnification error data, and the 
latter provides the focus error data, as before. This procedure elim- 
inates the small magnification error introduced by the prisms. Such 
errors amount to about 1:10,000 and can be neglected during the 
first several iterations.

Figure 8 compares a resolution test pattern and the corresponding 
image taken with a well-adjusted 3.5X camera. The narrow lines in 
each of the five “L” patterns are (on the reduction camera plate) 4, 
6, 8, 10, and 12 µm. (The finest detail required in normal operation
is 10 µm.) It can be seen that the 4 µm detail is adequately resolved

[Fig. 7 printout, panel (b): twelve measured magnifications between 3.499875 and 3.500130, each with its deviation in parts per 10,000. RMS MAGN DEVN = 0.199; LARGEST MAGN DEVN = 0.370.]

Fig. 7—Computer-generated maps of focus errors (a) and magnification errors 


(b) for a well-adjusted 3.5X reduction camera.
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PATTERN GENERATOR PLATE REDUCTION CAMERA PLATE 


Fig. 8—Photomicrographs of an artwork resolution test pattern (left) and the 
corresponding image plate (right) taken with the 3.5x camera. The image-plane 
linewidths are indicated at the right. The finest image linewidth required in 
normal use is 10 µm.


and that the 10 µm (fundamental) detail is well resolved with sharp
edges. 
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Device Photolithography: 


Reduction Cameras: 
Mechanical Design of the 
3.5X and 1.4X Reduction Cameras


By M. E. POULSEN and J. W. STAFFORD 
(Manuscript received July 14, 1970) 


I. INTRODUCTION 


The 3.5X and 1.4X reduction cameras basically employ the same 
structural features differing only in the lenses and focal distances 
required to achieve the desired reductions. Both cameras have been 
designed as fixed-focus cameras in that no adjustment is made on in- 
dividual components to optimize the focus and magnification. 

The camera incorporates the following design features: (i) isolation
of the camera from building vibrations; (ii) temperature compensation
in the long and short conjugates to compensate for changes in the lens
due to changes in the ambient temperature; (iii) sufficient structural
mass of individual components and material conductivity to avoid
local distortions due to rapid changes in ambient temperature; (iv)
artwork and image plates automatically positioned to within an
accuracy of about one micron; and (v) exposure control which can be
varied with a high degree of reproducibility.


II. PHYSICAL DESCRIPTION OF CAMERAS 


A rigid camera bed supports the elements as shown in Fig. 1. The 
camera bed is mounted on springs to provide vibrational isolation. The 
welded frame which supports the camera bed-spring assembly contains 
the pneumatic controls, lamp power supply, shutter-control electronics, 
and the Mask Shop Information System (MSIS) lamp and read-out 
supplies. 

The camera bed is made of two GA50 Meehanite cast iron I-beams 
which are connected laterally. The faces of the I-beams have been 
ground flat and parallel in pairs to provide an accurate support for 
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Fig. 1—Schematic of 3.5X and 1.4X reduction cameras, 
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the camera elements. As shown in Fig. 1, a large gusseted Meehanite 
angle bracket is bolted and pinned to the camera. This bracket provides 
the fixed support for the illumination assembly and the lens-and-image 
plate assembly. As will be discussed later, the rods of the conjugates 
are machined along with the assembled lens-and-image plates to ob- 
tain the correct theoretical lengths so as to yield the desired focus and 
magnification. Fine trimming of the lengths is performed utilizing a 
computer program until the optimum lengths are achieved (see Ref. 1). 

The lens plate is mounted on rollers and is free to move along the 
camera bed with changes in ambient temperature. Similarly, the 
end of the rod supporting the illumination assembly is mounted on 
rollers. The illumination assembly consists of the fresnel lens, intensity 
corrector, diffusion screen, and lamp-housing shutter assembly. 

As shown in Fig. 1, a roller support is provided for the image-plate 
structure of the 1.4X camera. On the 3.5X camera, this additional 
support is not needed because of the relatively small short-conjugate 
distance. Figures 2 and 3 are photographs of the 3.5X and 1.4X
reduction cameras.


III. VIBRATION ISOLATION 


Providing a vibration-free environment is essential if high-quality 
reductions are to be made. If excited, vibration of the camera bed 
would result in bending of the bed in many modes and thus could 
destroy the focus and magnification of the camera along with the 
alignment of its image relative to the artwork. To prevent this,
the 3.5X and 1.4X camera beds were designed to have a free-free
natural frequency of 100 Hz, and each bed is shock-mounted on springs
to yield a rigid-body natural frequency of 3 Hz. Reference 2 shows
that in this case the lowest natural frequency of the bed coupled to
the springs is the 3-Hz rigid-body mode, with the next resonant
frequency occurring at 100 Hz and the remaining frequencies from
100 Hz on up. From Fig. 4, taken from Ref. 3, one can see that if
the exciting frequency ω is three times greater than the rigid-body
natural frequency ω_n, the rigid body is essentially isolated from the
perturbing force. Normally, one can expect building floor slabs to
have a resonance of 12 to 15 Hz and the
foundation (i.e., base slab or cellar) to have a resonance of around
30 Hz or higher. Hence, mounting the reduction camera bed at 3 Hz
isolates it from all the disturbing building frequencies above 10 Hz.
Furthermore, because of the rigidity of the camera bed, it will behave 
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Fig. 2—3.5X reduction camera, 







Fig. 3—1.4X reduction camera.


as a rigid body if subjected to excitation below 10 Hz. The 3-Hz shock 
mounts are provided with inorganic (metal mesh) damping and the 
camera bed I-beams have an elastomeric damping compound bonded
to their webs to provide damping should the camera bed be inadvertently
excited.
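The degree of isolation provided by the 3-Hz mounting can be estimated with the standard transmissibility expression for a single-degree-of-freedom system, which is the kind of curve plotted in Fig. 4. The sketch below is illustrative only; the damping ratio of the metal-mesh mounts is not given in the text, so the undamped case is used as an assumption.

```python
import math


def transmissibility(freq_ratio, damping_ratio=0.0):
    """Transmissibility of a single-degree-of-freedom isolator.

    freq_ratio    : exciting frequency divided by the natural frequency
    damping_ratio : fraction of critical damping (0 = undamped)
    """
    r2 = freq_ratio ** 2
    d2 = (2.0 * damping_ratio * freq_ratio) ** 2
    return math.sqrt((1.0 + d2) / ((1.0 - r2) ** 2 + d2))


f_n = 3.0  # rigid-body natural frequency of the spring-mounted bed, Hz
for f in (10.0, 12.0, 15.0, 30.0):  # disturbing frequencies named in the text, Hz
    print(f"{f:5.1f} Hz: T = {transmissibility(f / f_n):.3f}")
```

At a frequency ratio of three the undamped transmissibility is 1/8, and it falls further as the exciting frequency rises, which is the sense in which the bed is isolated from disturbances above 10 Hz.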

The frame to which this shock-mounted camera is attached was 
designed so that its resonant frequency is 50 Hz, hence eliminating any 
possibility of the support structure being the source of a rigid body 
excitation near 3 Hz. 

The three support rods of the illumination system have a natural 
frequency of 40 Hz in the lowest mode which is lateral bending. The 
three rods of the long and short conjugates have a natural frequency 
of 200 Hz. 


IV. THERMAL DESIGN CONSIDERATIONS 


The reduction cameras are operated in a clean room which is
temperature-controlled to within ±0.15°C to maintain artwork and image
sizes as well as their relative positions. To further assure
reproducibility even under more adverse ambient conditions, additional design
features were incorporated. The rod material of the long and short
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Fig. 4—Transmissibility versus frequency for a single-degree-of-freedom rigid- 
body system. 


conjugates on both cameras was selected to compensate for both focus
and magnification errors due to the effect of temperature fluctuations of
the lens itself over ±5°C. The lengths of the long and short conjugates
vary with temperature in a prescribed manner to accomplish this
compensation. The variation is linear with temperature and is obtained by
selecting the rod material with the appropriate coefficient of thermal
expansion.

The good conductivity of the GA50 Meehanite and the large mass of
the bed ensure that only negligible thermal gradients through the bed
structure will be encountered and, hence, bending distortion of the bed
is effectively eliminated. The bed temperature will change uniformly
should a change in room temperature occur, thus preventing
degradation of focus, magnification, and artwork-image alignment.


V. MATERIAL SELECTION 


GA50 Meehanite was selected for the I-beams and angle bracket of 
the camera bed because of its good dimensional stability with time 
and its good conductivity. Both the I-beams and angle bracket were 
furnace annealed prior to rough machining and given a vibration stress 
relief after rough machining and prior to final machining. This was 
done to ensure the stability of the parts.

The illumination element holders were made from ground tool plate 
which was annealed to avoid warpage during final machining. 

The lens-holder plate and image-plate structure were made from
AZ-31 magnesium plate which has exceptionally good dimensional
stability. This material provided a rigid yet lightweight structure.

For the 3.5X reduction camera, the rods of the long conjugate were 
made from Hastelloy X and those of the short conjugate from a 49 
percent nickel iron alloy. These materials were selected because they 
had coefficients of thermal expansion which provided the required tem- 
perature compensation for the lens. 

For the 1.4X reduction camera, the long conjugate rods were made
from a composite two-material rod of Invar 36 and 49 percent nickel
iron alloy, and the short conjugate rods were made from a composite
two-material rod of stainless steel and 49 percent nickel iron alloy to obtain
the appropriate coefficient of thermal expansion.
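The way a two-material rod yields an intermediate expansion coefficient can be sketched as follows, assuming the two segments are joined end to end and expand freely, so that the effective coefficient is the length-weighted average of the two. The function names and the numerical coefficients are placeholders chosen for illustration; they are not the values used for the cameras.

```python
def effective_cte(alpha_1, alpha_2, fraction_1):
    """Effective expansion coefficient of two rod segments joined end to end.

    fraction_1 is the share of the total length made of material 1.
    """
    return fraction_1 * alpha_1 + (1.0 - fraction_1) * alpha_2


def fraction_for_target(alpha_1, alpha_2, alpha_target):
    """Length fraction of material 1 needed to reach a target coefficient."""
    if alpha_1 == alpha_2:
        raise ValueError("the two materials must differ in expansion coefficient")
    fraction = (alpha_target - alpha_2) / (alpha_1 - alpha_2)
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("target lies outside the range spanned by the two materials")
    return fraction


# Placeholder coefficients in 1/degC, chosen only for illustration.
ALPHA_LOW, ALPHA_HIGH, ALPHA_TARGET = 1.0e-6, 9.0e-6, 5.0e-6
share_low = fraction_for_target(ALPHA_LOW, ALPHA_HIGH, ALPHA_TARGET)
print(share_low, effective_cte(ALPHA_LOW, ALPHA_HIGH, share_low))
```

Any target coefficient between those of the two materials can be reached in this way, which is presumably why composite rods were chosen where no single alloy had the required value.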


VI. ADJUSTMENT OF THE LONG AND SHORT CONJUGATES 


A relatively gross adjustment in the mils range (i.e., 10⁻³” range) has
been provided on both the long and short conjugates of the reduction
cameras. In addition, an adjustment in the microinch range (i.e., 10⁻⁶”
range) has also been provided utilizing the technique developed for 
the mirror of the PPG (see Ref. 4). The gross adjustment is provided 
by compressing Belleville springs as shown in Fig. 5, and the fine 
adjustment uses elastic compression of rectangular pads into the metal 
surface to provide the microinch adjustment, the soft spring being 
the bolt itself as shown in Fig. 5. 

The long conjugate is bolted to the angle bracket reference surface. 
This end contains the pad washers which provide the microinch adjust- 
ment. The other end of the long conjugate is bolted to the lens plate 
and contains the Belleville springs used for gross adjustment. 
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Fig. 5—Adjustment mechanism for conjugates of reduction camera. 


The short conjugate is bolted to the lens plate and to the image-plate
structure. The end bolted to the lens plate contains the Belleville
washers for gross adjustment, and the end bolted to the image-plate
structure contains the pad washer for microinch adjustment. The gross
adjustment provided for each conjugate rod at the lens plate is
monitored with a dial indicator (later removed) capable of being read
accurately to within 0.0001” and having a range of ±0.01”. This
permits the adjustment of the long and short conjugate rods according to
the computer program discussed in Ref. 1. The pad-washer end of
each conjugate provides the fine adjustments in the microinch range
within a range of ±0.00025”. For the current reduction cameras it was
not necessary to use the microinch adjustment to bring the camera into
focus and magnification.


VII. ARTWORK AND IMAGE PLATE POSITIONING 


Both the artwork and image plates must be positioned reproducibly
against the locating pins. The artwork-plate locating pins on the
camera are positioned and constructed exactly as they are on the
pattern generator. Similarly, the image plate is positioned against pins
which are constructed exactly as they are on the artwork side of the
step-and-repeat camera.

To position the artwork and image plates in the reduction camera
successfully, the applied forces holding the plates against their locating
pins must be greater than the frictional forces. A static analysis, knowing
the coefficient of friction, allows one to adjust the relative forces in
the horizontal, vertical, and axial directions such that the plate
will always seat.
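A much simplified version of such a static check is sketched below. It is not the analysis used for the cameras; it assumes that the plate slides home in a given direction only if the clamping force in that direction exceeds the friction generated at the contacts loaded by the other two forces. The function name, force levels, and friction coefficients are all hypothetical.

```python
def will_seat(forces, mu):
    """Check a simplified seating condition for each locating direction.

    forces : dict of clamping force per direction (any consistent unit)
    mu     : coefficient of friction at the pin contacts
    The plate is assumed to slide home along one direction only if the force
    in that direction exceeds the friction generated by the other two.
    """
    result = {}
    for axis, force in forces.items():
        resisting = mu * sum(f for other, f in forces.items() if other != axis)
        result[axis] = force > resisting
    return result


# Hypothetical force levels and friction coefficient.
equal_forces = {"horizontal": 10.0, "vertical": 10.0, "axial": 10.0}
print(will_seat(equal_forces, mu=0.2))
print(will_seat(equal_forces, mu=0.6))
```

In this simple model, equal forces seat the plate only while the friction coefficient stays below one half; with higher friction the relative forces would have to be adjusted or applied in sequence.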

The artwork is placed into a vertical holder and pneumatically held. 
This holder, supported on bearings, is then pushed into the camera. 
Upon contacting a microswitch, the plate is released from the holder 
and clamped against vertical and horizontal pins. The holder is ejected
and the plate is then located onto the axial pins (i.e., pins parallel to
the optical axis). To accomplish this, a system of miniature pneumatic
cylinders utilizing dry nitrogen is used (see Fig. 6). The image plate
is also located pneumatically. The operations are controlled by
pneumatic and electrical components in a drawer located in the bench
assembly (see Fig. 7).

Artwork and image plates have been loaded repeatedly into the 
cameras. Statistical analysis of the data shows that the plates index 
reproducibly. For the nominal eight-inch by ten-inch, one-quarter- 
inch-thick artwork plate it was found that the plates had a mean 
seating error of from eight to thirteen microinches depending on the 
axis measured, with a standard deviation of from five to ten micro- 
inches. For the nominal four-inch by five-inch, one-quarter-inch-thick 
image plate, it was found that the plates had a mean seating error 
of from three to four microinches depending on the axis measured, with 
a standard deviation of from two to eight microinches. 
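For comparison with the positioning goal of about one micron stated in the Introduction, the seating figures can be converted from microinches to micrometers. The short sketch below uses only the ranges quoted above and the exact conversion 1 in. = 25.4 mm.

```python
MICROMETERS_PER_MICROINCH = 0.0254  # exact: 1 in = 25.4 mm


def microinch_to_um(value):
    """Convert a length in microinches to micrometers."""
    return value * MICROMETERS_PER_MICROINCH


# Ranges quoted in the text, in microinches.
reported = {
    "artwork plate, mean seating error": (8, 13),
    "artwork plate, standard deviation": (5, 10),
    "image plate, mean seating error": (3, 4),
    "image plate, standard deviation": (2, 8),
}
for label, (low, high) in reported.items():
    print(f"{label}: {microinch_to_um(low):.2f} to {microinch_to_um(high):.2f} um")
```

The mean seating errors are thus a fraction of a micrometer for both plate sizes.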


VIII. SHUTTER


For both cameras, it was deemed desirable to be able to vary the 
exposure time from 30 ms to 100 s. To provide uniform exposure, it 
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Fig. 6—Insertion mechanism for artwork. 


is necessary to have opening and closing times which are short
compared to the exposure time. It was not possible to purchase a shutter
having the required 6.3-cm aperture and the necessary range and
precision of exposure with an opening and closing time of 10 ms.

A commercial spring-activated shutter was modified to meet this 
requirement. The case A and plate B were retained as shown in Fig. 
8. The leaves were reinforced in the high-impact area, and the
leaf-activating mechanism was designed to ride on ball bearings to reduce
the frictional forces. The driving mechanism consists of an opening
and a closing solenoid with their armatures joined at the driving arm of
the shutter. The solenoids with the shutter are mounted on the lamp
housing, aligned, and pinned. At the ends of the armatures are
damping cushions to reduce bounce, and at the ends of the solenoids
are adjusting screws to ensure that the impact force is not absorbed
by the shutter leaf rotating slot. The shutter driving arm is coupled
to the solenoid armatures through a slot. The slot is longer than the 
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actuating arm is wide, providing for a 1/16” space at one end towards 
the activated solenoid. The armature travels 1/16” before contacting 
the activating arm, thereby accelerating without the load of the shut- 
ter mechanism. The total armature travel necessary to achieve the 
shutter fully open or closed is 0.188”. Opening or closing
takes 10 to 12 ms, depending on the friction in the assembly. This
is accomplished by the solenoid operating at five times its rated 
voltage. Since the duty cycle is very low, this causes no damage. The 
solenoid voltage is applied for 30 ms. The additional 20 ms is neces- 
sary to keep the shutter from bouncing and to provide damping-down 
time. The shutter solenoids are controlled by a digital timer, consisting 
of a 1-kHz crystal clock oscillator, a five-decade-selector switch, and 
associated integrated circuitry. 
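The relation between the timer hardware and the exposure range can be sketched as follows. The sketch assumes that the timer simply counts clock periods and that each decade of the selector switch sets one decimal digit of the count; neither detail is stated in the text, and the function name is hypothetical.

```python
CLOCK_HZ = 1_000   # crystal clock frequency given in the text
DECADES = 5        # five-decade selector switch

MAX_COUNT = 10 ** DECADES - 1          # largest setting on the switch
RESOLUTION_S = 1.0 / CLOCK_HZ          # one clock period


def exposure_to_count(seconds):
    """Switch setting (clock periods) for a requested exposure time."""
    count = round(seconds * CLOCK_HZ)
    if not 1 <= count <= MAX_COUNT:
        raise ValueError("exposure outside the timer range")
    return count


print(exposure_to_count(0.030))        # shortest exposure quoted in the text
print(MAX_COUNT * RESOLUTION_S)        # longest exposure the counter can time
```

On these assumptions the timer resolves 1 ms and reaches just under 100 s, which matches the exposure range stated at the start of this section.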

All shutters are acceptance tested to 3000 cycles; life-tested shut- 
ters have run over 100,000 cycles. The life-tested shutter showed signs 
of wear but no signs of imminent failure. 


IX. MASK SHOP INFORMATION SYSTEM 


The primary pattern generator (PPG) records identification codes 
on the plate. This information is used both for visual inspection and 
for automatic identification in the reduction cameras. The code
consists of human-readable and machine-readable information. The
machine-readable information is encoded as a series of clear or 
opaque rectangles located outside the primary pattern area. This 
binary information is read in the camera by a linear array of photo- 
transistors and sent to the MSIS computer which verifies that the 
proper plate has been loaded. 

The detector array is made up of 8 silicon chips, each with six 
phototransistors positioned in a row on a gold interconnection pattern 
on a sapphire substrate. This is mounted on a Bakelite assembly and 
attached to the artwork support structure (see Fig. 9). The detectors
are located 80 mils from the emulsion side of the artwork plates. On 
the opposite side of the artwork plate is located a special illuminator 
housing consisting of lamps and condenser lenses. To prevent the 
artwork plate from striking the MSIS illuminator housing, it was 
necessary to swivel the illuminator housing out of the way during 
artwork insertion. This was accomplished by means of pneumatic 
cylinders actuated in conjunction with the plate clamping pneumatics. 
Since space did not permit an in-line illuminator source, the housing 
was set off to the side and the light beams were brought into line 
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Fig. 9—MSIS assembly. 


by means of a prism. Each pair of silicon chips (12 phototransistors) 
is illuminated by one of four lamps independently switched on by the 
computer. The lamps are separated from one another by septums to prevent
interference, and each lamp has a lens to image the filament at infinity. The
collimated beam is directed into the prism which reflects the light 
through a linear array of twelve fly-eye lenses which in turn illuminate 
the twelve phototransistors through the information strip on the art- 
work plate. The four lamps are turned on in sequence, and the infor- 
mation is sent to the MSIS computer. 
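The read-out sequence described above can be sketched in a few lines. This is a simulation under stated assumptions, not the MSIS software: it assumes one bit per phototransistor (8 chips of six, or 48 bits in all), read as four successive groups of twelve as each lamp is switched on. The function names and the bit pattern are hypothetical.

```python
CHIPS = 8
SENSORS_PER_CHIP = 6
LAMPS = 4
SENSORS_PER_LAMP = CHIPS * SENSORS_PER_CHIP // LAMPS   # 12, as in the text


def read_strip(strip_bits):
    """Simulate the sequential read-out of the information strip.

    strip_bits : list of 48 values, 1 for a clear rectangle, 0 for opaque
    Only the sensors under the lamp that is currently lit see light, so the
    code is assembled from four successive 12-bit groups.
    """
    if len(strip_bits) != CHIPS * SENSORS_PER_CHIP:
        raise ValueError("expected one bit per phototransistor")
    code = []
    for lamp in range(LAMPS):
        start = lamp * SENSORS_PER_LAMP
        code.extend(strip_bits[start:start + SENSORS_PER_LAMP])
    return code


def verify_plate(strip_bits, expected_bits):
    """Stand-in for the check made by the MSIS computer."""
    return read_strip(strip_bits) == list(expected_bits)


pattern = [i % 2 for i in range(48)]   # made-up code for demonstration
print(verify_plate(pattern, pattern))
```

Switching the lamps one at a time, together with the septums, keeps light intended for one group of detectors from reaching its neighbors.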
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X. SUMMARY 


High-quality reduction cameras have been designed which are
unperturbed by normal building vibrations and which, due to their mass,
good material conductivity, and temperature compensation of the
conjugates, are unaffected by reasonable changes in the ambient
temperature. Insertion of the artwork and image is reproducible. A high-speed,
wide-aperture shutter, capable of being opened or closed in 10 to 12 ms, 
has also been designed with a life of over 100,000 cycles. 
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Device Photolithography: 


The Step-and-Repeat Camera 


By D. S. ALLES, J. W. ELEK, F. L. HOWLAND, B. NEVIS, 
R. J. NIELSEN, W. A. SCHLEGEL, J. G. SKINNER 
and C. E. STOUT, Jr. 


(Manuscript received July 10, 1970) 


We discuss in this paper the design of a new high-precision step-and- 
repeat camera with respect to its optics, mechanical design, control system, 
and control computer program. One-micrometer images from a 6-mm-square
lens field can be placed within 0.12 µm over a 10-cm × 10-cm area on a
photographic glass plate. Features such as image-plane control,
interferometric metering, and automatic reticle pattern alignment are used to
accomplish these objectives. The control computer, with CRT message
displays for the operator, provides efficient operator-machine interaction.


I. INTRODUCTION 


In previous papers, the equipment for converting the designer’s 
topography into a primary pattern and the subsequent reduction in size 
have been described. For thin film integrated circuits, the output of 
the reduction camera is the master mask from which working copies 
can be produced for use in fabricating the device. For semiconductor 
devices, however, the output of the reduction camera is ten times 
larger than the required final image size. Thus, a further reduction in 
size is required. In addition, a mask for a semiconductor device con- 
sists of an array of images that are precisely placed on the master 
mask. Thus, the step-and-repeat camera is both a reduction camera 
and, through the use of a moving X-Y stage, permits the placement of 
images in an array covering the desired field of the mask. 


1.1 Requirements 


If a final mask consisted of a single image and if only one mask level 
were required to produce a functioning semiconductor device, the
step-and-repeat camera would be a simple tool to design and build. In
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actuality, a mask is complex. As shown in Fig. 1, a mask consists of an 
array of images, the majority of which are the primary images required 
for fabricating the specific device. In addition to the primary images, 
a wide range of test, secondary continuity and alignment images used 
during the fabrication process are included. Thus, before a mask can be 
produced on the step-and-repeat camera, all of the reticles containing 
the required images must be available. 

In fabricating a semiconductor device, multiple masks are needed, 
each corresponding to a specific processing step. Typically, nine to 
twelve distinct levels are required for integrated circuits. For the mask 
set to be useful, the images in the various levels must be in registration 
from one level to the next. As device features have become smaller, 
the requirement for registration of the mask images from level to level 
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Fig. 1—Typical integrated-circuit photo mask and its various patterns. 
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has become more stringent. In addition, certain classes of devices such 
as the silicon target for the Picturephone® camera tube, though basic- 
ally simple, must be produced by field-butting techniques that require 
the step-and-repeat camera to provide precise image placement. 

The design objectives for the step-and-repeat camera are summa- 
rized in Table I. The first three items, resolution, image field size and 
distortion, are established by the optics of the system. The image place- 
ment accuracy and array size, items 4 and 5, are determined by the 
mechanical stage and the position sensing and control system. The last 
item, the operating time, is of interest because it relates to the balanced 
operational capability of the entire mask-making system. 

Past experience with step-and-repeat camera operations has revealed 
that a camera capable of the performance listed in Table I does not 
guarantee error-free operation, i.e., high yield. The major cause of 
low camera output is operator error. Thus considerable attention has 
been given to eliminating, where possible, operator tasks that have been 
shown to result in errors. 


1.2 General Description 

The step-and-repeat camera described in the following sections is a
single-head, ten-times-reduction camera mounted over a moving X-Y
stage supported and guided by air bearings. Table position is
determined using double-pass interferometers for both X and Y axes. The
physical arrangement of the completely assembled camera is shown in 
Fig. 2. The physical size is approximately 1.2 m in width and depth 
and 1.5 m high. The camera and stage systems are on the operator’s 
left; operator displays and controls are on his right. 

The glass photographic plate on which the mask will be made is 
positioned in a fixture on the camera’s stage. The reticle whose pattern 
will be projected onto the photographic plate is located below the 
hinged cover which supports the flash lamp and condenser housing. The 
camera’s status and operator instructions are given on the lighted 
message board and all normal operator controls are provided on the 
operator keyboard. A more extensive set of controls is available for 
camera maintenance. 

All of the camera functions are controlled and monitored by a com- 
puter which is located outside the camera’s temperature-controlled clean 
room. The use of a small computer rather than a hard-wired controller 
has allowed the camera’s operation to be flexible, and provides nearly 
automatic operation with a minimum of operator intervention. 

The mask-making sequence is initiated by the operator’s request

TABLE I—DESIGN GOALS FOR THE STEP-AND-REPEAT CAMERA

1. Lens Resolution                     1 µm
2. Maximum Image Size                  5 mm square
3. Image Distortion                    0.1 µm
4. Image-Placement Accuracy            0.12 µm
5. Maximum Size of the Array           10 cm
6. Typical Step-and-Repeat Time        1200 s




(at the operator keyboard) for a new job. The computer in turn re- 
quests a job from the Mask Shop Information System (MSIS). The 
list of required reticles and other pertinent information is displayed for 
the operator on a CRT. The operator loads both the photographic plate 
and the correct reticle and then commands the camera to continue. The 
array information is transmitted from the MSIS computer and the pat- 
tern is step-and-repeated. When the mask is completed or a new reticle 
is needed, the operator is alerted by a message on the display board 
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Fig. 2—The step-and-repeat camera. 
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and by an auditory alarm. Operator errors or equipment malfunctions 
are also indicated on the message board. 

The following discussion of the camera is divided into four major
sections: Section II, Optical Head Assembly; Section III, Stage
Assembly; Section IV, Control System; and Section V, Program.
Although these are discussed separately, all systems, both
hardware and software, were developed simultaneously with close
collaboration between personnel to assure a smoothly functioning
camera design.


II. OPTICAL HEAD ASSEMBLY 


2.1 General Features 


The optical head assembly is a complete unit containing all the 
optics and electronics necessary to project a pattern on to the mask. 
The salient features of the optical head are: 


(i) It projects a 5-mm-square image with a line-width capability of
1 µm on photographic emulsion.

(ii) It can deliver four times the energy needed to expose dyed
KHRP plates.

(iii) It has the power capacity to project six formats per second.

(iv) Exposures are made while the table is in motion with negligible
line-width errors.

(v) It has a built-in auxiliary projection system to facilitate
centering the flash lamp.

(vi) The reticle is held in place by six pneumatic plungers which
operate in a programmed sequence to ensure that it is correctly
positioned against its support pads. It is then automatically
aligned by optical means to an accuracy of ±0.25 µm, which is
equivalent to a positioning accuracy of ±0.025 µm in the image
plane.

(vii) The reticle number is electronically identified after it is aligned.

(viii) A 5 × 7 lamp array can be projected through the main
projection lens for writing identifying alpha-numeric information
on the mask.

(ix) The main structure is thermally compensated to maintain
focus and magnification over a temperature range of ±3°C.

(x) The assembly is “fixed-focus” with no external adjustments.

(xi) The lens is automatically protected from above by a shutter
whenever the reticle is removed.
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2.2 Reticle Format 


The reticle for the step-and-repeat camera is the 4” × 5” output
plate from the 3.5X reduction camera. The reticle format, shown in
Fig. 3, has a pattern area of 5.2 cm × 6.4 cm with the corners bounded
by a radius of 3.536 cm. At one end of the format is a secondary
information strip containing 42 binary bits, each 0.5 mm square. The data
recorded in this strip is the drawing number of the reticle pattern.


At the other end of the format are two fiducial marks that are used 
for automatic-positioning of the reticle when it is mounted in the 
camera. 


2.3 Projection Lens 

The lens, which was designed and manufactured by Tropel, Inc.,* to 
Bell Telephone Laboratories specifications, is a nine-element double- 
Gauss design (See Fig. 4). The object to image distance is 48.37 cm at 
a 10:1 image reduction, the effective focal length is 4.12 cm, the
f-number is 1.4 at infinity, and the spectral range is 436 nm ± 7.5 nm. One
of the design criteria for the lens was to maintain a large working 
distance between the lens and the image plane to provide for the lens 
air bearing which is described in Section 3.1. 

The computed image distortion does not exceed 0.1 µm at any point
in the projected field. The calculated modulation transfer function
(MTF) at the center and at the edge of the field of view is shown in
Fig. 5. Allowing for a slight decrease in these values due to
manufacturing errors, the lens can produce 1-micron lines having an MTF of
0.4, which is adequate for exposing KHRP plates.
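As a rough plausibility check on the quoted MTF, the diffraction limit for an aberration-free lens can be estimated. The sketch below is not the lens designer's calculation. It assumes incoherent illumination, a circular pupil with unit pupil magnification, an image-side working f-number of N(1 + m) with m = 1/10, and that a 1-µm line corresponds to a 2-µm line-and-space period.

```python
import math


def diffraction_mtf(spatial_freq, cutoff_freq):
    """MTF of an aberration-free circular pupil under incoherent light."""
    x = spatial_freq / cutoff_freq
    if x >= 1.0:
        return 0.0
    return (2.0 / math.pi) * (math.acos(x) - x * math.sqrt(1.0 - x * x))


WAVELENGTH_MM = 436e-6          # 436 nm, center of the stated spectral range
F_NUMBER_INFINITY = 1.4         # f-number at infinity, from the text
REDUCTION = 10.0                # 10:1 image reduction

# Assumed working f-number on the image side at finite conjugates:
# N_eff = N * (1 + m), with image-to-object magnification m = 1/10.
n_eff = F_NUMBER_INFINITY * (1.0 + 1.0 / REDUCTION)
cutoff = 1.0 / (WAVELENGTH_MM * n_eff)          # cycles per mm

# Assumed: a 1-um line is half of a 2-um line/space period.
line_freq = 1.0 / (2.0 * 1e-3)                  # 500 cycles per mm

print(round(cutoff), round(diffraction_mtf(line_freq, cutoff), 2))
```

Under these assumptions the diffraction-limited value at 1-µm lines is somewhat under 0.6, so an MTF of 0.4 after allowing for residual aberrations and manufacturing errors appears reasonable.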


2.4 Illuminating & Condenser System 

The purpose of the condenser system is to provide adequate illumina- 
tion in the correct spectral range uniformly over the projected field. 
In the step-and-repeat camera, the exposure is made while the table is 
moving so as to reduce the time required to make a mask. Thus it is 
necessary to use a short exposure time to minimize the image blur. 

In this camera the illumination is supplied by an EG&G, FX-76 flash 
lamp with a special glass envelope having a ripple-free front surface to 
minimize illumination non-uniformities. The lamp has a maximum 
rated power input of 15 W which is sufficient to expose six patterns per 
second on dyed KHRP emulsion. The flash duration is about 15 µs


* Located in Fairport, New York. 
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Fig. 3—Reticle format. 


which allows a maximum table velocity of 0.5 cm per second with a line
edge blur of less than 0.1 µm. Another point of concern is that the
jitter time between the computer command and the onset of the flash
should be small and constant; experiments of over a quarter of a
million flashes show that the positioning error due to the jitter time is
less than 0.005 µm. The maximum rated energy input to the flash lamp
is 10 J, which is four times that required to expose dyed KHRP
emulsion.
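The blur and energy figures in this section are mutually consistent, as the following arithmetic sketch shows. It takes the flash duration to be 15 microseconds and assumes that the blur is simply the distance the table travels while the lamp is lit.

```python
FLASH_DURATION_S = 15e-6      # flash duration from the text
TABLE_SPEED_CM_S = 0.5        # maximum table velocity from the text
MAX_POWER_W = 15.0            # maximum rated lamp power
FLASHES_PER_S = 6             # exposure rate
MAX_ENERGY_J = 10.0           # maximum rated energy per flash

# Distance travelled by the table while the lamp is lit, in micrometers.
blur_um = TABLE_SPEED_CM_S * FLASH_DURATION_S * 1e4

# Energy available per flash when running at the rated average power.
energy_per_flash_j = MAX_POWER_W / FLASHES_PER_S

# Margin between the rated flash energy and the energy used per exposure.
energy_margin = MAX_ENERGY_J / energy_per_flash_j

print(blur_um, energy_per_flash_j, energy_margin)
```

The table moves 0.075 µm during the flash, within the 0.1-µm blur limit, and six flashes per second at 15 W correspond to 2.5 J per flash, one quarter of the 10-J rating.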

A six-element condenser assembly with a large collection angle im- 
ages the flash-lamp discharge onto the entrance pupil of the projection 
lens. The assembly contains a filter having a transmission bandwidth 
of 15 nm centered at 436 nm and having a uniform transmission to
within 5 percent over its working area. 


2.5 Mechanical Construction 


The optical head, including its pneumatic control system and reticle- 
positioning electronics, is mounted on a Meehanite casting which 
bridges the interferometrically controlled stage. The head contains the 
projection optics, flash lamp, reticle positioning system, alpha-numeric 
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Fig. 4—Projection lens. 


writing system and the machine language reading detectors (See Fig. 
6). 

The structure consists of three parallel plates separated by rods, the 
center or main plate being fastened to the top of the bridge. The upper, 
or reticle plate is supported on slender rods which permit lateral dis- 
placement of the plate for positioning the reticle. The plate displace- 
ment is controlled by stepping motors driving a low angle cam 
through a large gear reduction. This drive mechanism is mounted on 
the cylindrical housing that encloses the space from the main plate to 
above the reticle plate. A pneumatically operated cover, incorporating 
the flash lamp and condenser system, is hinged at the rear of this 
housing. 

The lower, or lens, plate is suspended from the main plate by three 
rods and attached to the underside of the bridge by a diaphragm that 
prevents vibrations normal to and around the optical axis but permits
displacement along the axis. A housing, which provides the mounting
for the projection lens, the lens air bearing and the retro-reflectors for 
the reference legs of the interferometer (See Section 3.4), is attached 
to the lower surface of the lens plate. 

The plate and rod structure is designed to reduce the thickness tol- 
erances on the plates. This has been done by grinding flat one reference 
surface on each plate. The rods then terminate on only these surfaces 
(See Fig. 6). Where required, the rods are inserted through holes in the 
plate and fastened to steel disks screwed to the reference surface. A 
similar construction is employed in mounting the projection lens, i.e., 
the lens housing is fastened to the reference surface of the lens plate 
and the lens flange is screwed to the interface surface of the lens 
housing. 

The changes in the lens conjugates necessary to maintain the focus 
and magnification over a temperature range of 38°C were calculated
from the lens and lens holder design. These values were then used in 
selecting the materials for the rods and the lens housing so that the 
focus and magnification are compensated over the specified temper- 
ature range. 

Fig. 5—Calculated modulation transfer function curves for the projection lens.

Fig. 6—Optical head assembly.

The head is aligned and focussed before it is installed in the bridge.
The reference surfaces of the plates are set parallel to each other within
25 μrad by adjusting the length of the rods. Centering of all parts,
except the reticle plate, is controlled by the initial machining. The 
reticle plate is offset toward the stepping motor drives so that when the 
plate is centered the slender rods are flexed. The flexure of these rods 
causes the plate to maintain contact with the positioning drives. 

The lens conjugates were measured on an optical bench and the rods
machined to these dimensions. This procedure can give only approximate
positioning and adjustments are therefore provided in the design.
The lens-flange-to-air-bearing distance is adjusted by interposing Hoke
gauge blocks under the lens flange. This permits controlled adjustment
in 2.5-μm steps and angular adjustments of 50 μrad. Final adjustments
are made by varying the air space between the lens air bearing and the
mask. The long conjugate is varied by changing three pads that support
the reticle. A 0.5-mm range in 50-μm steps is provided. Since these
adjustments are critical, they are not available for operator manipulation;
i.e., the camera is a fixed focus and magnification instrument.


The air bearing holds the mask to within 0.25 μm of its theoretical
position (See Section 3.1) which provides excellent control for image 
quality and introduces an image distortion of only 6 parts per million 
(ppm). 


2.6 Reticle Alignment 


The reticle is secured to its mount by six pneumatic plungers fed 
through throttling valves to give them a programmed sequence to 
insure that the reticle is correctly positioned against the locating pads. 
The locating pads are in the same location as those used in the reduction 
camera to help minimize the centering errors. The two fiducial marks 
on the reticle are imaged with a 5.5× magnification on to two EG&G
SGD-444 photodiodes. One fiducial mark is a cross pattern and is
imaged on to a quadrant detector, for X-Y positioning, and the second
mark is a straight bar pattern that is imaged on to a bi-cell detector,
for θ orientation. The arms of the X-Y fiducial mark are 30 μm wide
by 710 μm long, and the θ fiducial mark is 40 μm wide by 1500 μm long.
The X-Y mark is imaged on to the quadrant photo-diode as shown 
in Fig. 7. 

The microscopes that image the fiducial marks on to the photo- 
diodes incorporate the following features: 


(i) A bent optical path, provided by two dove prisms, prevents the
structure of the microscope from infringing upon the main
pattern projection area.

(ii) Diode illumination, from an external light source via fiber
optics, for visual alignment of the diodes.

(iii) External rotation adjustment of diodes for initial alignment.

(iv) A refractor block, which can be adjusted through the reticle
plate, for aligning the pattern with the X-Y axes of the table
to within an accuracy of 5 μrad.

Fig. 7—Photo diode.


Fig. 8—Block diagram of reticle alignment electronics.

The output current from each quadrant in the photo-diodes is proportional
to the amount of light focussed on to it, and the difference in
output from conjugate quadrants is a measure of the reticle displacement.
A schematic of the electronics is shown in Fig. 8. The diode outputs
are amplified and converted to a voltage signal to give a reticle
displacement sensitivity, near the balance point, of 1.6 V/μm. This
signal is monitored by a voltage comparator having a dead-space
window of ±0.2 V, which is equivalent to a dead-band of 0.25 μm. The
signal-to-noise ratio at the balance position is about 20 to 1, thus
ensuring adequate signal strength. The outputs from the comparators
control switching logic which in turn drives three stepping motors.
These move the reticle via the three low-angle cams to the balance
position. One step of the motor displaces the reticle 0.02 μm. Experimentally,
the system positioning accuracy was measured to be ±0.25
μm. The total range of motion is ±125 μm; however, in order to avoid
driving the cam through its transition region, the pattern on the reticle
is required to be within ±63 μm of its correct position. The total
balancing time is about 15 s, which includes a 5-s pause at the end of the
positioning period to ensure that the system is completely stable. The 
switching logic is then shut down, the stepping motors are put in the 
“hold” position, but the power to the amplifier and the photodiodes 
remains on at all times. Measurements to date indicate an overall drift 
of about 0.2 μm per day, and the drift during the period required to
expose a complete mask is negligible. The complete electronic package 
includes a sensitivity calibration system that displaces the reticle one 
micrometer and measures the corresponding output voltage. 
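
The relation between the comparator window, the displacement sensitivity and the motor step size can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the detector response stays linear over the whole travel (the text claims this sensitivity only near the balance point), and the names `dead_band_um` and `balance` are hypothetical.

```python
import math

SENSITIVITY_V_PER_UM = 1.6   # displacement sensitivity near balance (from the text)
WINDOW_V = 0.2               # comparator window is +/- this voltage (from the text)
STEP_UM = 0.02               # reticle travel per motor step (from the text)


def dead_band_um():
    """Full width of the displacement band in which the comparator is silent."""
    return 2 * WINDOW_V / SENSITIVITY_V_PER_UM


def balance(offset_um, max_steps=20000):
    """Step toward balance until the error signal falls inside the window.

    Assumes a linear error signal over the whole range, which is a
    simplification made for this sketch.
    """
    steps = 0
    while abs(offset_um * SENSITIVITY_V_PER_UM) > WINDOW_V and steps < max_steps:
        offset_um -= math.copysign(STEP_UM, offset_um)
        steps += 1
    return offset_um, steps


residual_um, steps_taken = balance(63.0)   # start at the edge of the allowed range
print(f"dead band: {dead_band_um():.2f} um")
print(f"residual offset: {residual_um:.3f} um after {steps_taken} steps")
```

Because one motor step is much smaller than the dead band, the step size does not limit the positioning accuracy in this model; the comparator window does.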


2.7 Machine Language Read-Out 

The secondary information strip is monitored by the computer when 
the reticle has been positioned. The 42 bits are read as four separate 
groups for the convenience of transferring the information, each group 
having its own lamp and condenser system. The lamp is imaged on 
to each bit-area by means of a fly’s-eye lens array mounted just above
the reticle, and under each bit-area is a photo-transistor for monitoring 
whether the bit area is clear or opaque. 


2.8 Alpha-Numeric Display

A 5 by 7 lamp array is available for producing characters 2 mm high 
in the center of the field of view of the projection lens. A fly’s-eye
lens array with 35 lenses each 2.5 mm square is placed just in front of 
the lamp array. The effect of the lens array is to increase the amount 
of light collected from each lamp, as opposed to using no lens array 
at all, and to image each lamp as a square on the mask. The required 
exposure time is 1 s which requires that the mask be stationary during 
the exposure. 


2.9 Operation 


The reticle is loaded in the camera by the operator and is clamped 
into position by activating an air switch. The camera cover is closed 
by an air-piston, activated by the operator, and the system is then 
transferred over to computer-control. The computer initiates the
reticle-positioning electronics, which then proceed to center the reticle.
If for any reason the reticle has not been positioned after a given 
length of time, the computer turns off the servo-system and informs 
the operator. When the computer has been informed that the reticle 
is correctly positioned, it will then read the secondary information strip 
to ensure that the correct plate is in the camera. The optical head is 
now ready for operation. 


III. STAGE ASSEMBLY 


3.1 Focus Control 

For a lens capable of 1.0-μm line resolution and precise control of
magnification, the depth of focus is less than 1.0 μm. Since photographic
plates having a submicrometer flatness are unavailable, an
automatic focusing system is required to assure well-focused images
over the entire step-and-repeat array. The flattest commercially available
photographic plates for use in this camera are Micro Flat
KHRP.* These have an over-all flatness of 6.5 and 16 μm, respectively,
for the 4” × 5” and 8” × 10” plates. If the surface at the plate’s
perimeter is positioned at the image plane, the remaining portions of
the plate will be above or below the image plane by as much as the
flatness specification and the resulting images will be unacceptable. In
addition to the plate’s lack of flatness, its emulsion is 6 μm thick; thus
only a thin layer of the emulsion can be in focus, the rest being out 
of focus. This difficulty was overcome by R. E. Kerwin? who dyed the 
emulsion with a material which absorbs strongly at 436 nm. The dyed 
plate is then only exposed at its top surface and it is only this surface 
which must be maintained in the lens image plane. 

To maintain the plate in focus, it is mounted in a softly suspended 
fixture attached to the stage. The plate’s emulsion surface is allowed to 
glide under a stiff air bearing which is attached to the lens housing. 
Since the lens air bearing is in effect a very stiff spring and the plate 
support system is relatively soft, the distance between the photographic 
plate and the lens bearing will be nearly constant for large deflections 
of the plate holding system. From Fig. 9 it is possible to predict the 
performance of this focus control scheme. If K_F and K_L are the stiffnesses
of the plate support (fixture) and the lens bearing and ΔY is the deflection
of the plate supporting fixture due to photo plate distortion, the resulting
change in plate-to-lens bearing force is Δf. This results in a plate-


* Manufactured by Eastman Kodak, Inc., Rochester, New York. 




Fig. 9—Theoretical prediction of focus control system performance.
(Labels: K lens bearing; K fixture; photographic plate; change in lens
bearing to plate spacing distance; fixture deflection due to plate
distortion.)


to-lens spacing change of Δz. For a given plate distortion the error in
the emulsion surface location is proportional to the ratio of the plate
support stiffness to the lens bearing stiffness, K_F/K_L; in practice the
lens bearing is about 40 times stiffer than the plate support. This
results in focus errors of 0.15 and 0.3 μm for 4” × 5” and 8” × 10”
plates, respectively.

The mechanical realization of the focus control may be seen in Fig. 
10. The photographic plate is pneumatically clamped in a fixture. The 
emulsion side, which is up, rests against 14 co-planar pins which locate 
the perimeter of the top surface parallel to the lens’ image plane. The 
plate clamping fixture is in turn attached to the movable step-and-re- 
peat camera stage by four pairs of stressed parallel springs. This 
suspension system is stiff to all rotations and translations excepting 
motion in the vertical direction. In order to minimize the vertical stiff- 
ness (ky) the springs must be horizontal; however, in this position their 
load-bearing capacity is zero. Therefore, to support the weight of the 
plate and its clamping fixture an annular groove covered with a flexible 
diaphragm is provided under the plate-supporting fixture. This is 
inflated until the diaphragm, acting against the main stage, makes 
the springs horizontal and brings the edges of the clamped plate to 
the elevation of the image plane. By inflating the chamber each time 
a new plate is loaded, the correct starting elevation is achieved regard- 
less of differences in plate weight or barometric pressure. The pneu- 
matic counterbalance would add no stiffness to the plate support 
system if it were operated at constant pressure. However, it is more
practical to inflate the system to the proper height and close the valve
than to establish and maintain the correct pressure; therefore, it is
operated at a constant volume. The combined stiffness of the parallel
springs and pneumatic counterbalance system is between 5.0 × 10⁴
and 8.5 × 10⁴ N/m.

Fig. 10—Focus control system.

The lens bearing is 2.54 cm in diameter and has a central 0.94-cm 
hole through which the image is projected. The placement of this bear- 
ing between the lens and the surface is only possible because of the 
lens’ large working distance. In operation the lens bearing-to-plate 
spacing is 12.5 μm, its stiffness is 4.4 × 10⁶ N/m, and the normal force
which it exerts against the plate is 44 N. If this load is transmitted to 
the fixture and diaphragm through the photographic plate, it will 
result in a large deflection of the plate and increase the difficulty of 
maintaining good focus. Therefore, an equal but opposing force is 
applied to the lower surface of the plate by a soft air bearing placed 
directly below the lens bearing. This bearing is mounted on a low 
friction pneumatic plunger which is raised into place after the plate is 
loaded. This bearing provides nearly constant upward force on the 
plate regardless of plate bow or taper and does not contribute to the 
system stiffness. A further discussion of the air-bearing design is 
given in Section 3.3. 

The focus control system was evaluated by using a laser inter- 
ferometer in place of the projection lens to measure the relative 
motion between a mirrored plate clamped in the fixture and the lens 
housing. Using typical plates, the focus control is able to maintain the
elevation of the plate’s surface on the optical center line within ±0.25 μm
of the image plane.
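
The stiffness argument of this section can be illustrated with a two-spring model. The sketch below assumes the lens bearing and the plate support act as ideal linear springs in series, so that a fixture deflection ΔY produces a spacing change Δz = ΔY·K_fixture/(K_fixture + K_lens). This model, the function name and the variable names are assumptions made for illustration. The stiffness values are taken as 5.0 to 8.5 × 10⁴ N/m for the plate support and 4.4 × 10⁶ N/m for the lens bearing.

```python
def spacing_error_um(distortion_um, k_fixture, k_lens):
    """Change in lens-to-plate spacing for a given fixture deflection.

    Force balance for two linear springs in series:
        k_lens * dz = k_fixture * (dY - dz)
    """
    return distortion_um * k_fixture / (k_fixture + k_lens)


K_LENS_N_PER_M = 4.4e6                    # lens air bearing stiffness
K_FIXTURE_RANGE_N_PER_M = (5.0e4, 8.5e4)  # plate support stiffness range
FLATNESS_UM = {"4 x 5 in": 6.5, "8 x 10 in": 16.0}

for plate, flatness in FLATNESS_UM.items():
    low, high = (spacing_error_um(flatness, k, K_LENS_N_PER_M)
                 for k in K_FIXTURE_RANGE_N_PER_M)
    print(f"{plate} plate: spacing error between {low:.2f} and {high:.2f} um")
```

With these inputs the model gives errors of a few tenths of a micrometer at most, comparable to the focus errors quoted in the text, which suggests the series-spring picture captures the main effect.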


3.2 Stage Design 

All major parts of the camera are supported on a one-meter-square 
block of granite. The top surface of the granite is flat to 2.6 μm and
three 15-cm-square areas under the stage support bearings are parallel
to within 5 μrad. The block is supported on three special Barry Control
Corporation* Serva-Level® mounts which provide vibration isolation 
in the vertical and horizontal directions. The vertical and horizontal 
natural frequencies are 0.82 Hz and 0.87 Hz respectively. A 1200-kg 
lead ballast is attached to the underside of the granite to lower the 
camera’s center of gravity and assure the stability of the Serva-Level® 
system. The extra ballast also increases the working pressure in the 
air mounts which minimizes the effects of sudden changes in the am- 
bient pressure due to opening and closing doors in the air conditioned 
facility. 

The stage is supported on three two-inch-diameter air bearings. Each 
bearing is attached to the stage through a spherical bearing assembly 
and a spacer made up of Hoke gauge blocks (See Fig. 11). The spheri- 
cal bearing assembly allows some initial angular adjustment of the 
bearing during assembly and alignment; however, once the weight of 
the stage is resting on the bearing the static friction in the spherical 
bearing prevents further movement. The gauge block stack between 
the bearing assembly and the stage allows the elevation and tilt of 
the stage to be adjusted more accurately than is possible by machining. 
Since the stage rests directly on the granite surface rather than on an 
intermediate stage as is customary in machine-tool construction, its 
attitude depends only on the granite surface and is constant within 
2.5 μrad.

The stage is guided by an intermediate structure which is called the 
cross. The cross is constrained to move only in the X (right-left) direction
by two pairs of 2.5-cm-diameter air bearings and two collinear
quartz guide blocks which are secured to the granite base. The sides of
the guide blocks are straight and parallel to 0.25 μm and the blocks are
optically aligned on the granite surface to achieve a cross yaw of less
than 1.25 μrad over 10 cm. The cross is supported on four 2.5-cm-diameter
air bearings, two resting directly on the granite surface and
two on the top surface of the quartz guides. All of the cross support and 


* Located in Watertown, Massachusetts.

Fig. 11—Cross section of stage showing cross and drives.


guide bearings are also assembled using spherical bearings and gauge
blocks. The stage is also guided by four air bearings and a pair of
quartz guide blocks mounted on the top surface of the cross parallel
to the camera’s Y axis. The straightness of travel along this axis is
similar to the X axis, giving a total stage yaw of less than 2.5 μrad
over the entire 10- × 10-cm travel of the stage. Therefore, this is the
maximum rotational error between any two images on the final mask 
due to the guiding system. 

The parallel springs of the plate holding fixture are attached at 
the outer edges of the stage. This fixture and its diaphragm support rest 
on the top surface of the stage. The elevator bearing for the focus
control system is attached to the granite surface on the optical axis 
and passes up through a slot in the cross. 

The X-axis drive is attached to the cross through a slender flexible 
column which allows both vertical and horizontal misalignment be- 
tween the cross and drive. Under worst-case conditions, the maximum 
cross rotation due to the force required to deflect this member is 0.1 
μrad. The Y-axis drive is coupled directly to the stage through two
air bearings and a guide bar attached to the stage.

All of the camera’s major components are cast from Meehanite GC-40
because of its good stability and good damping qualities. The castings
were X-rayed to assure their soundness and were heat treated
prior to initial machining, prior to the final grinding operation, and
again after all machining operations. This heat treatment provides
phase stability (ferrite and graphite) and assures low creep rates under
the low stress conditions of its use. All nonmating surfaces were then
painted with an air dry vinyl paint. 


3.3 Air Bearings 

The stringent requirements of position and focus control necessitated 
that the camera be supported and guided by stiff low-friction bearings. 
Investigation of various bearing types revealed that gas hydrostatic 
thrust bearings possessed the necessary characteristics to meet these 
requirements.2 They operate in a nearly frictionless manner (frictional 
resistance is about 1/4000 of that of a light oil bearing). They are very 
simple in construction permitting relative ease in meeting mechanical 
tolerance requirements. Their load-stiffness characteristics are such 
that camera requirements can be met with low gas-supply pressures
(<3.4 × 10⁵ N/m²) and total gas consumption (<2.3 x 107? standard
m³/s).

The performance and space requirements for the various air bearings 
employed on the camera are indicated in Table II. The plate bearing 
is a central-jet type and the remaining bearings are ring-jet varieties 
(See Fig. 12). 

The design and development of the gas bearings involved both ana- 
lytical and experimental programs. The analytical program consisted 
of computer predictions of steady? and transient? behavior of the afore- 
mentioned bearing types. The experimental program was used to verify 
analytical predictions as well as prove-in the focus-control system. 
Typical results of these programs are shown in Table III and Fig. 13.


TABLE II—AIR BEARING REQUIREMENTS

Bearing         Load                            Stiffness                  Space
Main Stage      160-220 N                       1.2-1.8 × 10⁷ N/m          5.08 cm OD
Cross Support   36-54 N                         0.35-0.53 × 10⁷ N/m        2.54 cm OD
Cross Guides    22-45 N                         0.18-0.35 × 10⁷ N/m        2.54 cm OD
Lens            36-54 N                         0.35-0.53 × 10⁷ N/m        2.54 cm OD
Plate           Force equal to Lens B[earing]   Zero or finite but small   3.81 cm OD

Fig. 12—Air bearing configurations.


3.4 Interferometer Design

The correct placement of every image on the photographic mask is a 
primary requirement of a step-and-repeat camera. Therefore, special 
attention was given to the design of the camera’s interferometers. Both 
the X and Y interferometers are identical and they share the output 
of a frequency stabilized HeNe laser. Their outputs indicate both the 
direction of stage motion and the distance traveled. Each output 
pulse represents 0.04-μm stage travel or 1/16 wave length (λ). The
interferometers are arranged in a double pass configuration so that
each fringe represents a λ/4 displacement of the stage (See Fig. 14).
Two photocells monitor the output light beams whose phases differ 
by 90°. This phase difference is achieved by the use of circularly polar- 
ized light and polarizers before each photocell, and it is used to indicate 
the stage’s direction of travel and to further divide the output fringe 


TABLE III—ACTUAL AND THEORETICAL AIR BEARING CHARACTERISTICS

                                     Gas Supply     Average Stiffness (N/m)
                       Load Range    Pressure
Bearing                (N)           (N/m²)         Theoretical    Experimental
Lens & Cross Support   38-53         1.4 × 10⁵      4.3 × 10⁶      2.6 × 10⁶
                                     2.8 × 10⁵      5.5 × 10⁶      4.4 × 10⁶
Main Stage Support     154-220       2.1 × 10⁵      14 × 10⁶       11 × 10⁶
                                     2.8 × 10⁵      18 × 10⁶       19 × 10⁶
Guide                  25-43         2.1 × 10⁵      5.6 × 10⁶      2.9 × 10⁶
                                     2.8 × 10⁵      5.8 × 10⁶      4.2 × 10⁶

Fig. 13—Theoretical and experimental air-bearing characteristics.

Fig. 14—Arrangement of stage position interferometer.

interval by four through the use of the two signals’ zero crossings.
Judicious placement of the interferometers is required to assure the 
maximum accuracy of the camera. For instance the X and Y measuring 
legs intersect the camera’s optical axis thus making the measured loca- 
tion of the image independent of small stage rotations about the verti- 
cal axis. For machines in which the measuring legs do not intersect the 
optical axis, legitimate translations and errors due to rotation are indis- 
tinguishable. (This is always the case with some heads of a multiple- 
head camera.) Similarly, the measuring beams are at the same eleva- 
tion as the photographic plate making the image position independent 
of small amounts of stage pitch and roll. 
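
The count size and the direction sensing described above can be sketched as follows. The nominal HeNe wavelength of 0.6328 μm is not stated in the text and is supplied here as an assumption, as is the choice of which phase sequence counts as forward motion; `quadrature_step` and `count_pulses` are hypothetical names.

```python
HENE_WAVELENGTH_UM = 0.6328  # nominal HeNe red line (assumed, not from the text)

FRINGE_UM = HENE_WAVELENGTH_UM / 4   # double-pass arrangement: one fringe per lambda/4
COUNT_UM = FRINGE_UM / 4             # zero crossings of two signals 90 degrees apart

# Logic states of the two photocell signals over one fringe, in the order
# seen for (arbitrarily chosen) forward motion.
_SEQUENCE = [(0, 0), (0, 1), (1, 1), (1, 0)]


def quadrature_step(previous, current):
    """Return +1, -1 or 0 counts for one change of the two-signal state."""
    delta = (_SEQUENCE.index(current) - _SEQUENCE.index(previous)) % 4
    if delta == 2:
        raise ValueError("both signals changed at once: a transition was missed")
    return {0: 0, 1: 1, 3: -1}[delta]


def count_pulses(states):
    """Net count over a list of successive two-signal states."""
    return sum(quadrature_step(a, b) for a, b in zip(states, states[1:]))


one_fringe_forward = _SEQUENCE + [_SEQUENCE[0]]
print(f"stage travel per count: {COUNT_UM:.4f} um")
print("counts per fringe:", count_pulses(one_fringe_forward))
```

The computed count size agrees with the 0.04-μm figure in the text to within rounding, and reversing the state sequence yields counts of the opposite sign, which is how direction is recovered in this model.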

The measuring legs are terminated at the stage by 12-cm-long porro 
prisms. The photographic plate is rigidly attached to these prisms 
through the plate-clamping fixture, thus assuring that motions of the 
porro prisms are identical to those of the photo plate. Since the stage- 
drive system continually moves the stage to maintain a particular 
fringe count, its location along the two axes is dependent only on the 
straightness of these prisms and not on the straightness of the stage 
guides. Similarly the orthogonality of the two axes is only dependent 
on the relative mounting of these prisms. On the camera these prisms 
are set at right angles to within 1.25 μrad.

The reference leg retro-reflector is attached to the optical head as 
close to the image plane as is practical. In this way the interferometer 
output represents the relative locations of the stage and the optical 
head rather than the location of the stage with reference to a leg fixed 
in the interferometer body as is the usual practice. This allows com- 
pensation for deflections of the optical head which would otherwise 
go unnoticed. 

The optical parts were fabricated from fused silica because of its 
excellent stability. They use total internal reflection and are not anti- 
reflection coated in order to eliminate the mechanical distortions which 
frequently accompany the deposition of these coatings. They have 
a wave-front accuracy of one-tenth wave over their 12-cm length and 
they are mounted in a nearly stress-free state so their accuracy will 
not be reduced by mechanical strain. 

The laser wave length changes with variations in atmospheric conditions.
Since the room temperature is maintained constant to ±0.13°C,
corrections for temperature are not necessary. However, barometric
corrections are made prior to each exposure because a pressure variation
of 3.44 × 10³ N/m² results in 1.0-μm errors. When wave length corrections
are calculated, the actual difference in the measuring and reference
leg lengths must be known. Therefore, the granite table has
been fitted with sensors which allow the establishment of one absolute 
stage position from which all corrections for varying ambient condi- 
tions are made. 


3.5 Stage Drive 

The stage and cross are driven by identical low-backlash drive systems.
Motion in the X and Y directions is imparted to the cross and
stage through 1.3-cm-square bars which are guided on all sides by air
bearings, and driven longitudinally by a capstan which is an extension 
of the motor shaft (See Fig. 15). Sufficient driving force is achieved 
by pinching the drive bar between the driven capstan and a spring- 
loaded idler. The capstan and idler shafts are mounted so that no net 
transverse force is transmitted to the drive bar and they are prevented 
from moving in the direction of the drive bar by flexures attached to 
the body of the drive unit. The motors are mounted below the granite
table and their shafts pass vertically through holes in the granite in an
attempt to minimize any heat transfer from the motors to the camera
structure. The lack of gearing and the use of flexures minimize the
drive train backlash and enhance the system stiffness.

Fig. 15—Y axis drive.

The drives can move the stage at any uniform velocity up to 0.5
cm/s and they are capable of stopping the stage from this velocity in
less than 2¹¹ interferometer counts (80 μm). They can also maintain
the stage to within plus or minus two interferometer counts (0.08 μm)
of its desired position. This is accomplished through the servo system 
which is shown in block diagram form in Fig. 16. 
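
The drive figures above can be cross-checked against the 0.04-μm count size. The short calculation below is only a consistency check; the variable names are illustrative.

```python
import math

COUNT_UM = 0.04                 # stage travel per interferometer count
MAX_VELOCITY_UM_PER_S = 0.5e4   # 0.5 cm/s

# Stopping distance of 80 um expressed in counts, and the nearest power of two.
stop_counts = round(80.0 / COUNT_UM)
stop_exponent = round(math.log2(stop_counts))

# Interferometer pulse rate when slewing at full speed.
pulse_rate_hz = MAX_VELOCITY_UM_PER_S / COUNT_UM

# Holding tolerance of plus or minus two counts.
hold_tolerance_um = 2 * COUNT_UM

print(f"{stop_counts} counts, about 2**{stop_exponent}")
print(f"pulse rate at full speed: {pulse_rate_hz / 1e3:.0f} kHz")
print(f"holding tolerance: +/-{hold_tolerance_um:.2f} um")
```

The stopping distance is therefore close to 2¹¹ counts, which fits within the range of a twelve-bit signed register, and the full-speed pulse rate is far below the counting rate quoted later for the axis interface.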


IV. CONTROL SYSTEM 


4.1 Servo System 

Since the X- and Y-axis servo systems are identical only one will be 
discussed. They operate in three modes: constant speed slewing, decelerating
from a constant speed and holding at a fixed location. The
stage velocity is determined by the value in the twelve-bit speed regis- 
ter which is converted to an analog signal in the digital-to-analog 
(D/A) converter. Thus when the number in the speed register remains 
constant, the D/A output is fixed and the stage runs at a constant 
speed. The stage velocity is stabilized through a digital tachometer 
feedback loop which uses the rate of interferometer pulses to determine 
the stage velocity. The drive system stability was further enhanced by 
the addition of a small viscous damper on the motor shaft.

Fig. 16—Block diagram of stage-positioning servo system.

When the stage must stop at a given location, the value in the speed
register is made equal to the distance from the stopping location in
interferometer counts (0.04 μm per count) and the D/A output voltage
decreases toward zero as the stopping location is approached. When 
the stage is at the desired location, it is imperative that the value in 
the counters become zero and remain or recross zero in a small limit 
cycle so that each image on the step-and-repeat mask will be in the 
correct location. Ideally when the position error is zero, the stage 
should stop and if it moves slightly, say one count, the motor should 
again drive it to zero. Unfortunately small amounts of drift in the 
D/A converter or the servo amplifier will cause the stage to stop and 
remain at some point with other than zero in the counter and speed 
register. Even without this electronic drift, the system’s static friction, 
although small, will require unreasonably large gains to assure that 
errors of one count in the speed register (0.04-μm stage position error)
will be corrected. 

Both problems have been overcome by demanding that the stage 
execute a small limit cycle which includes the zero location. This is 
accomplished by adding to the D/A output a two-valued function 
which is positive when the speed register is positive and negative 
otherwise (See Fig. 17). The magnitude of this step voltage is just 
sufficient to cause the motor to drive the stage toward zero in spite of 
electronic drift and mechanical stiction. This assures a continual re- 
crossing of the zero-minus-one transition point. In addition it reduces 
the positioning error by removing the dead band of 0.04 μm which
occurs if the stage location corrections are made only when the value in 
the speed register becomes plus or minus one. 
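
The effect of the added step voltage can be illustrated with a toy simulation. Everything in the model below is an assumption made for illustration: the gain, the drift and stiction voltages, and the scaling of one count of travel per volt per time step are arbitrary. A register value of zero is treated as positive, as it would be for a sign bit, so the stage alternates between zero and minus one.

```python
import math


def servo_input(register, gain, step_v):
    """D/A output plus the two-valued step voltage.

    The step is +step_v when the register is non-negative (sign bit
    clear, so zero counts as positive) and -step_v otherwise.
    """
    step = step_v if register >= 0 else -step_v
    return gain * register + step


def simulate(start_counts, step_v, gain=0.05, drift_v=0.1,
             stiction_v=0.22, n_steps=2000):
    """Return the register value at every step of a toy stage model.

    error is the signed distance from the stopping location in counts
    and the register holds floor(error). The stage moves only when the
    amplifier input exceeds the stiction threshold.
    """
    error = float(start_counts)
    history = []
    for _ in range(n_steps):
        register = math.floor(error)
        history.append(register)
        u = servo_input(register, gain, step_v) + drift_v
        if abs(u) > stiction_v:
            error -= u  # a positive input drives the error toward zero
    return history


without_step = simulate(100, step_v=0.0)
with_step = simulate(100, step_v=0.5)
print("final count without step voltage:", without_step[-1])
print("counts visited at rest with step voltage:", sorted(set(with_step[-50:])))
```

In this model the stage stalls short of zero when only the proportional D/A signal is applied, and settles into a small limit cycle about the zero-minus-one transition once the step voltage is large enough to overcome drift and stiction.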


4.2 Digital Computer and Interface 

The control of the camera is coordinated through a Digital Equip- 
ment Corporation PDP-8/L computer. This has a 12-bit word length, 
a cycle time of 1.6 μs and 4096 words of core memory. Its function is
to provide communication between the operator, the camera and the 
MSIS and to make the necessary conversions and calculations for the 
camera’s operation. The connection to the MSIS PDP-9 computer is 
through an interface and a high-speed data link. The interface pro- 
vides buffering to allow data transfer between the computers to occur 
asynchronously. The information transferred across this link includes 
step-and-repeat array data from the information system, operating 
status, and requests for information from the camera. The computer
and interface racks are located outside the step-and-repeat camera
room to minimize any heat transfer to the camera structure.

Fig. 17—Servo amplifier input voltages versus stage-positional error.

The information transferred between the camera and the PDP-8/L 
is stored, acted on, and relayed by three other interface sections, each 
of which consists of about 180 DTL integrated circuits on a board 
with wire wrapped interconnections (See Fig. 18). The three sections 
are the identical X- and Y-axis control interfaces and the accessory 
interface. The latter interface monitors camera functions and provides 
special tests, which are not related to the stage positioning system, 
such as reticle identification, interlock testing, and atmospheric pres- 
sure monitoring. 

Operator/computer communication is also effected through the accessory
interface which controls both an illuminated message board
and an auditory alarm as well as storing input from the operator key- 
board. An additional interface allows the computer to output supple- 
mental instructions of a variable nature on a CRT display. 

The X- and Y-axis interfaces interconnect their respective inter- 
ferometers and drives. These interfaces, once loaded from the PDP- 
8/L, are capable of positioning the camera stage at any location within 
the limits of its travel and exposing an image on the plate at that point 
without further intervention from the computer, thus leaving the com- 
puter free for other work. 

Each axis interface has two storage registers which may be loaded
from the computer: a 24-bit register, which contains the address of the
stage’s “next location” relative to the current stage destination, and a
12-bit register (the previously mentioned speed register), which contains
the binary equivalent of the speed at which the stage is to move. The
heart of the interface is a 24-bit binary up/down counter for accumulating
interferometer output pulses. It consists of four six-bit parallel-carry
counters connected in series to allow counting in either direction
at rates of 4 MHz. This counter is initially loaded with a number from
the “next location” register and the stage is moved until the counter
becomes zero. At this time the optical head flash lamp may be triggered
and the counter may be automatically reloaded from the “next
location” register.

Fig. 18—Block diagram of computer/camera interface.

The speed register and the counter are connected to a comparator 
which, upon a command to stop at the next location, will compare 
their contents. When they become identical it will continually transfer 
the value of the counter into the speed register keeping these two equal. 
This makes the value in the speed register proportional to the distance 
from the stopping location thus providing automatic stage deceleration. 
The interface also initiates a computer program-interrupt when its 
counter has become zero. This immediately alerts the computer to the 
fact that its previous requests have been carried out and that the 
status of the interface may be updated. 
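
The counter, speed register, and comparator described above can be sketched in a few lines of Python. This is only an illustrative model, not the interface logic itself: the function name, the `speed_units_per_pulse` scale factor, and the rule that the stage advances in proportion to the speed register are assumptions, and only motion toward a positive "next location" is modeled. The comparison uses "less than or equal" rather than strict equality so that the discrete simulation cannot step past the match.

```python
def run_to_next_location(next_location, speed_command, speed_units_per_pulse=16):
    """Simulate one move; return a list of (remaining counts, speed register)."""
    counter = next_location   # counter loaded from the "next location" register
    speed = speed_command     # speed register loaded from the computer
    trace = []
    while counter != 0:
        # Comparator: once the remaining count has come down to the speed
        # value, keep copying the counter into the speed register.
        if counter <= speed:
            speed = counter
        # Assumed stage model: pulses per tick proportional to speed, at least one.
        pulses = min(counter, max(1, speed // speed_units_per_pulse))
        counter -= pulses
        trace.append((counter, speed))
    # counter == 0 here: the flash lamp may fire and the interrupt is raised.
    return trace


trace = run_to_next_location(10000, 2000)
speeds = [speed for _, speed in trace]
```

In this model the speed register holds the commanded value until the remaining distance falls to that value, after which it shrinks with the remaining distance, which is the automatic deceleration the text describes.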

The size and complexity of the axis interfaces justify the inclusion
of several special functions. These allow the computer, using a special
program, to perform maintenance tests on these interfaces to identify
malfunctions and enumerate possible corrective actions.


V. COMPUTER PROGRAM


The program stored in the step-and-repeat camera control computer,
the PDP-8/L, couples the camera’s systems into an automatic production
tool. Logic sequences of this program could have been provided
by hardware logic components; however, implementation in this man- 
ner would not allow the flexibility of a computer program and would 
require many more electronic components. The balance between pro- 
gram logic and hardware logic has been established by providing hard- 
ware functions that greatly reduce either program complexity or com- 
puter time and by utilizing program logic where decision making or 
complex hardware logic would be required. The division of logic func- 
tions between hardware and program has been greatly influenced by 
experience with previous Bell Telephone Laboratories computer-controlled
photolithographic equipment.

Features utilized in the program to accomplish the control objectives
are:

(i) Live interaction with the operator at all times using noninterrupt
programming;
(ii) Input data conditionally accepted at any of three input terminals,
with two of the terminals serving a dual use for operator control;
(iii) Message board and CRT display used to communicate with the
operator;
(iv) Multiple use of the axis-control routine and all other routines
when possible;
(v) Overwriting loader areas to provide maximum utilization of
computer core; and
(vi) Self starting of the program on loading with a checking routine
to verify correct loading.


5.1 Functions Performed by the Program 


The control program’s objective is to provide automatic control of
pattern placement on a photographic plate. In this operation the only
operator tasks are initializing the computer program, installing and
removing the photographic plate, and installing reticles as required.
However, in cases of mask shop operational decisions and
equipment malfunctions, operator intervention is also needed. These
requirements have resulted in a set of necessary control program
functions:


(i) Initialize and terminate a table maneuver.
(ii) Transfer data as needed to the interface.
(iii) Reproduce a series of text characters at specified locations on the
photographic plate using the 5 × 7 light array.
(iv) Automatically zero the table (initialize the interferometer
counters).
(v) Read a decimal input format and convert it to binary values
suitable for interface use.
(vi) Summon the operator when human intervention is needed.
(vii) Communicate with the operator through a message board and a
CRT display.
(viii) Check installed reticle for correct identification number and
control its alignment procedure.
(ix) Receive input from either the operator, a paper tape in the
teletype or the MSIS computer as specified by the operator.
(x) Provide a maintenance function which transfers control of the
table to the maintenance keyboard.


These ten functions are either self-descriptive or have been described
in prior sections.

The format used to transfer data from outside sources to the
step-and-repeat control computer includes nine code characters:


Y—Indicates Y-axis coordinate value; 
X—Indicates X-axis coordinate value; 
D—Spacing between images; 
N—Number of images on (D) spacing; 
R—Repeat the last line of data; 
A—Repeat the data preceding the last line; 
E—End the run; 
*—Following characters represent a reticle number; and 
“ ”—Enclosed characters are to be written as text on the mask.


The first four code characters require that a minimum of one digit 
follow them to specify their magnitude with a maximum of seven 
digits. For the convenience of paper-tape input, a decimal point may 
be used with the digits to the left of the decimal point indicating the 
distance in millimeters. If one or two digits are specified and no decimal 
point is used, the decimal point is assumed to be after the last digit.
For more than two digits, the decimal point is assumed to be after the 
third digit. 
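
The implied-decimal-point rule can be illustrated with a short Python sketch. The function name `parse_mm` is hypothetical, and the sketch assumes that the one-to-seven-digit limit counts digits on both sides of an explicit decimal point; the text does not say how the original program handled malformed fields.

```python
from decimal import Decimal


def parse_mm(field: str) -> Decimal:
    """Convert the digits following a Y, X, or D code into millimeters."""
    digits = field.replace(".", "", 1)
    if not digits.isdigit() or not 1 <= len(digits) <= 7:
        raise ValueError(f"expected one to seven digits, got {field!r}")
    if "." in field:
        # Explicit decimal point: digits to its left are whole millimeters.
        return Decimal(field)
    if len(field) <= 2:
        # One or two digits: the point is assumed after the last digit.
        return Decimal(field)
    # More than two digits: the point is assumed after the third digit.
    return Decimal(field[:3] + "." + field[3:])


values = [parse_mm(s) for s in ("5", "12", "1000", "1234567", "25.4")]
```

Under these assumptions "12" is read as 12 mm, "1000" as 100.0 mm, and "1234567" as 123.4567 mm.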

Whether rows or columns will be run is determined by the sequence of 
the first two code characters following a reticle number or a text request, 
or upon starting a new mask array. If the sequence is Y---X--- the 
program assumes the data following is to be placed in rows with all 
subsequent Ys being the Y coordinates of their respective rows.
Alternatively, the sequence X---Y--- signifies column data with the columns
located at the specified X coordinates. Image locations along a row 
(column) are specified by subsequent X---s or by a D followed by an 
N, each followed by its appropriate digit value. Numbers following 
Ns are evaluated as integers and not by the aforementioned decimal 
format. The only restriction on this format is that all values along 
a row or column must be in increasing magnitude. 
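
A minimal sketch of how these rules might be applied is given below. The function names are hypothetical, and the reading of a D/N pair as "N images in total, starting at the preceding coordinate and spaced D apart" is an assumption, since the text does not state whether N includes the starting image. "Increasing magnitude" is taken to mean strictly increasing.

```python
def orientation(first_two_codes: str) -> str:
    """Decide row or column running from the first two code characters."""
    if first_two_codes == "YX":
        return "rows"
    if first_two_codes == "XY":
        return "columns"
    raise ValueError(f"unexpected leading codes {first_two_codes!r}")


def expand_spacing(start, spacing, count):
    """Assumed meaning of D followed by N: count images, spacing apart."""
    return [start + i * spacing for i in range(count)]


def is_increasing(values):
    """Check the one stated restriction on values along a row or column."""
    return all(a < b for a, b in zip(values, values[1:]))


positions = expand_spacing(10, 5, 4)   # images at 10, 15, 20, 25
```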


5.2 Program Philosophy 


A block flow diagram of the control program is shown in Fig. 19. 
In this figure bold lines indicate the main path of control through the 
program, light lines indicate paths that are taken when the block’s 
function has been requested and broken lines are paths taken in going 
to the control routines when a waiting point has been reached. Most 
of these waiting points are labeled as “gates”. These gates are points
in the program that a control function dare not pass until some task 
has been completed. For example, the run and the main gates prevent 
simultaneous operation in either the run area, which is controlling the 
table motion, or the loading area, which is bringing in data and decipher- 
ing coded characters. The keyboard monitor routine maintains contact 
with the operator, allowing intervention in the camera’s operation. 

The program for running the camera has been developed using a 
foreground-background philosophy. Since the primary purpose of the 
program is to control the camera, the foreground program is the set of 
routines which directly control the table motions. The main routine of 
the foreground program is a general running routine which will con- 
trol an axis through its most general maneuver, that of running along 
a row or column and exposing images at specified locations. To ac- 
complish this, the routine requires the table of data for image place- 
ment to be available in the computer core. Because of the urgency in 
transferring information to the interface hardware when a task is com- 
pleted, the foreground program is interrupt addressed. 

Maneuvers of the stage other than the most general are accomplished
by defeating inappropriate functions in the general running routine to
make it perform as required. These changes are made prior to the desired
maneuver and the general running status is restored after the
maneuver is completed.

Fig. 19—Program block flow diagram.

The background program provides input points from the communi- 
cation equipment and communicates with the operator. This program 
operates on a noninterrupt philosophy. Noninterrupt programming is 
used because D.E.C. teletypewriter philosophy will cause the com- 
puter to overflow its limited storage capacity when operating with 
paper-tape input. The teletypewriter’s hardware interrupt has been 
disabled to allow this noninterrupt philosophy. 

To eliminate waiting time for slow input terminals like the teletype- 
writer, the machine waiting points are programmed to return control 
of the computer to a master-control routine. The master-control routine
then cycles through the other waiting points searching for work to
be done. This technique allows the general running routine to be ini- 
tialized and, while waiting for the addressed axis to complete its 
maneuver, the second axis can be initialized and data received from 
the input terminals. 
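
The master-control idea, in which every waiting point hands control back so that other work can proceed, resembles cooperative round-robin polling. The sketch below uses Python generators to stand in for waiting points; the names and the use of generators are illustrative assumptions and say nothing about how the PDP-8/L program was actually coded.

```python
from collections import deque


def waiting_point(polls_until_ready):
    """A task that reports 'waiting' a few times and then 'done'."""
    for _ in range(polls_until_ready):
        yield "waiting"          # hand control back to the master routine
    yield "done"


def master_control(tasks):
    """Cycle through the waiting points until every task has finished."""
    queue = deque(tasks)
    log = []
    while queue:
        name, task = queue.popleft()
        try:
            status = next(task)  # let this task run up to its next waiting point
        except StopIteration:
            continue             # task finished earlier; drop it from the cycle
        log.append((name, status))
        queue.append((name, task))
    return log


log = master_control([("x_axis", waiting_point(2)), ("teletype", waiting_point(1))])
```

Here the slow "teletype" task and the "x_axis" task interleave, so neither blocks the other while it waits.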

Control information from the operator is entered through the opera- 
tor keyboard or the teletypewriter. However, if the teletypewriter is 
being used to input data, operator control through the teletypewriter is 
not allowed. 

Output to the CRT display is controlled through the computer in- 
terrupt facility after being initialized by the background program. 
This implementation was made because interrupt techniques minimize
the asynchronous nature of this transmission. 


5.3 Implementation of the Program 

To implement the required functions with the control program, all
locations in the 4096-word core of the PDP-8/L have been assigned
an operational use. In doing this all program loaders are written over
the background routine. The program logic occupies all locations from
0 through 5777₈. Locations 6000₈ through 6777₈ are used to store incoming
data, the first half from 6000₈ through 6377₈ being table 1 and
the second half, 6400₈ through 6777₈, being table 2. Each table contains
the information for one row or column of images. The technique of
using two data storage tables allows reading data into one table while
the second table is being used to run a row or column of images. It
also allows a mask using only two types of spacings to be produced by
only transferring the Y- (or X-) coordinate values for all rows (or
columns) following the initial table information transfer.
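
As a quick arithmetic check, the octal address ranges quoted in this section (including the reticle-list area at 7000₈ through 7777₈) can be tabulated in Python to confirm that they account for the full 4096-word core. The dictionary keys are descriptive labels chosen for this sketch, not names from the original program.

```python
# Inclusive octal address ranges as given in the text.
MEMORY_MAP = {
    "program logic": (0o0000, 0o5777),
    "data table 1": (0o6000, 0o6377),
    "data table 2": (0o6400, 0o6777),
    "reticle lists": (0o7000, 0o7777),
}


def region_size(name):
    """Number of 12-bit words in a region (both end addresses included)."""
    first, last = MEMORY_MAP[name]
    return last - first + 1


sizes = {name: region_size(name) for name in MEMORY_MAP}
total_words = sum(sizes.values())
```

The program logic takes 3072 words, each data table 256 words, and the reticle lists 512 words, for a total of 4096.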

Locations 7000₈ through 7777₈ are used to store the list of reticles
used to make a mask. This area is also divided into two equal table 
areas. One area will store the list of reticles for the current mask. 
When the data is transferred from the MSIS computer, the table is 
loaded immediately following the initiation of a job. However, when 
the job is being entered through the teletypewriter, this table area will 
store each reticle number at the time the operator is requested to load 
the reticle into the camera. 

The second table area contains the list of reticles used to generate 
the preceding mask. Either of these two lists may be displayed on the
CRT display at the operator’s request.

Initial loading of the program has been reduced to a “push button”
operation through a hardware deposited elementary loader. When this 
loading is complete and the computer is started, any tape which is in 
the high-speed, perforated tape reader will be read. In the case of the 
program tape for the step-and-repeat camera computer, the D.E.C. 
binary loader has been incorporated at the beginning with a sufficient 
number of statements added to cause the computer to switch from the 
elementary loader to the binary loader. The binary loader then loads 
the program’s binary tape without the computer coming to a stop. At 
the end of the program, binary tape statements have been included 
which overwrite the binary loader and switch control of the computer 
to a test area to check for correct loading of the program. If the pro- 
gram has been properly loaded, the computer starts the camera con- 
trol program. If the computer must be stopped, a restart location has 
been provided for the operator. 


VI. CONCLUSION 


We have discussed the design of a step-and-repeat camera capable 
of meeting the most exacting integrated circuit mask requirements. The 
requirements for precision image placement, 1-µm line-width resolution,
and minimum operator intervention have influenced every aspect
of the camera’s design. A system capable of maintaining the photographic
surface in focus to within ±0.25 µm was developed in order
to assure maximum image resolution and correct image magnification. 
The stage guide, drive and measuring systems utilize air bearings and 
multiple-pass interferometers to achieve precise image placement. The 
computer program provides, in addition to camera control, operator 
checking and communication to simplify the operator’s job and to 
minimize errors. 

Device Photolithography:


Thin Photosensitive Materials 


By R. E. KERWIN 
(Manuscript received June 3, 1970) 


New camera systems, utilizing lenses of high numerical aperture and 
concomitant shallow depth of focus, require thin recording media. A number 
of materials potentially fulfilling this requirement are discussed. These 
include photoresist-coated metal or semitransparent masks, some uncon- 
ventional photographic processes, and dyed photographic emulsions. The 
use of dyed photographic emulsions is recommended on the basis of sensitivity
and improved resolution and modulation of the recorded image.


I. INTRODUCTION 


In this paper we discuss some recent developments in thin recording 
media in light of their suitability for exposure in the new step-and- 
repeat camera system. The requirements of integrated-circuit pattern 
generation have led to the development of a wide-field objective lens 
for this camera having high numerical aperture (N.A.) corrected for
diffraction-limited performance using monochromatic light. Specifically,
the lens has a 7.1-mm field diameter, f/1.5 at 10:1 conjugate ratio, and
is corrected for λ = 436 nm.

This lens has a depth of focus shallower than the thickness of the
photosensitive emulsion on the thinnest high-resolution photographic
plates. Kodak High Resolution Plates (KHRP) consist of a 6-µm-thick
Lippman-type emulsion of small (< 0.1 µm) silver halide grains in
gelatin on a flat glass substrate. For any projection lens of f-number
less than f/1.7 the depth of focus is less than 6 µm. This is illustrated
in Fig. 1 which is an idealized ray diagram, drawn to scale, showing
in cross section a 6-µm-thick photographic emulsion of refractive index
1.56 into which a linear array of diffraction-limited spots on 1-µm
centers have been projected through an f/1.5 lens using 436-nm light.
Even a perfect lens images a point source as a diffraction patch, the
Airy Disc, having a radius r to the first dark ring given by:

$$ r = \frac{0.61\,\lambda}{\mathrm{N.A.}} = \frac{0.61\,\lambda}{n \sin\theta} \qquad (1) $$

In accordance with this equation, the cone angles (θ) of illumination in
Fig. 1 are functions of the lens aperture, the refractive index (n) of
the medium, and the wavelength. The width of each rectangular
shaded area is equal to the radius of the Airy Disc, and the depth of
focus is approximated as the region of overlap of this with the cone
of illumination. Light scattering due to the difference in refractive indices
of the silver halide and gelatin is not indicated in the figure although
it is recognized as a major source of image spread.
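
For a sense of scale, eq. (1) can be evaluated for the f/1.5 lens at 436 nm. The sketch below assumes the common approximation N.A. ≈ 1/(2 × f-number) for the image-side aperture, which the text does not state, and uses the emulsion index of 1.56 to estimate the cone half-angle inside the emulsion. Function names are illustrative.

```python
import math


def numerical_aperture(f_number):
    """Assumed approximation: N.A. ~ 1 / (2 * f-number)."""
    return 1.0 / (2.0 * f_number)


def airy_radius_um(wavelength_um, f_number):
    """Radius to the first dark ring, r = 0.61 * wavelength / N.A."""
    return 0.61 * wavelength_um / numerical_aperture(f_number)


def cone_half_angle_deg(f_number, refractive_index):
    """Half-angle theta inside the medium, from N.A. = n * sin(theta)."""
    return math.degrees(math.asin(numerical_aperture(f_number) / refractive_index))


radius = airy_radius_um(0.436, 1.5)        # about 0.8 um
half_angle = cone_half_angle_deg(1.5, 1.56)  # about 12 degrees in the emulsion
```

Under that assumption the Airy radius is roughly 0.8 µm, comparable to the 1-µm spot spacing drawn in Fig. 1.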

From Fig. 1 it is evident that some out-of-focus illumination is
capable of acting on the photosensitive emulsion. A fraction of this
out-of-focus illumination will be recorded as a function of the sensitivity
of the emulsion and the efficiency of the developing process, thus
degrading the overall image quality. In the case of large images or
uniform-sized small images this could be circumvented by darkroom
“clipping” so that exposures below a defined threshold would not be
developed. This is not possible in microelectronic photomask production
since the intensities of fine lines near the diffraction limit vary
as a function of line width. Figure 2 shows the intensity profiles obtained
with our 10X lens for isolated lines of widths 1, 2, 4, and 10 µm,
all normalized to the same width W. These have been calculated by
convolving the modulation transfer function of the lens at the edge
of the field with the light-distribution function of the object, indicated
by the dashed rectangle. Since the intensity profiles are symmetrical,
only one half-cycle is shown. A focal-plane image of a 1-µm
line would have at its center only 69 percent of the light intensity at
the center of a neighboring 10-µm line. Thus, “clipping” would result
in a loss of fine-line image detail.

Fig. 1—Ray diagram indicating the depth of focus of a linear array of
diffraction-limited spots 1 µm apart projected into a 6-µm emulsion of refractive index
1.56 through an ideal f/1.5 lens using 436-nm light.

Another problem of a depth of focus shallower than the recording- 
medium thickness is the formation of spurious images. As indicated in 
Fig. 1, the regions of overlap of adjacent cones of illumination may 
provide sufficient intensity for exposure giving rise to spurious images 
between the real images.? 

Thus, it is apparent that even high-resolution photographic plates 
must be regarded as three-dimensional systems and for optimum use of 
the new lenses thinner recording media must be obtained. A number of 
approaches to the solution of this problem have been tried and are dis- 
cussed below. 


II. THIN PHOTORESIST FILMS 


It is immediately attractive to those familiar with microelectronic 
photolithography to use photoresists, which have demonstrable high- 
resolution capabilities in thin films, as the required thin recording 
medium. Figure 3 illustrates this approach using the step-and-repeat 
camera to project images into a photoresist coating on metal or semi- 
transparent films on glass substrates. In this case, the thin photoresist 
film (0.3 µm) records the high-resolution image without depth-of-focus
limitations and, after development, controls the transfer of this image
by etching into the 0.1-µm film of chromium or iron oxide to provide
the optical density and hardness required of a photomask. 

However, there are serious problems related to this approach and
each of these must be solved before this approach can become practicable.
These difficulties are the standing-wave effect due to the reflective
substrate, the apparent high-modulation requirements of
photoresist, and the low photographic speed of photoresist.

Fig. 2—Calculated intensity distributions as a function of linewidth for isolated
slits as imaged by the 10X camera lens.

Fig. 3—Ray diagram of projected point sources spaced at 1-µm intervals in
0.3-µm-thick photoresist coating on 0.1-µm-thick film of metal or semitransparent
mask material (f/1.5 lens, 436-nm light).

It is known that exposure of photoresist films on reflective substrates
leads to the formation of standing waves due to the interference of the
incident and reflected light waves, and these in turn produce nonuniformly
exposed strata in the photoresist.*"’ This effect is already seriously
detrimental in contact printing with polychromatic light,
340 ≤ λ ≤ 440 nm, and will be enhanced in our projection printing with
monochromatic light, λ = 436 ± 8 nm. It has been demonstrated that
the first node or minimum in intensity lies 0.07 µm above a chromium
mask surface so that normal exposure of a negative photoresist in these
circumstances would result in a 0.07-µm developed film which is too
thin to withstand etching solutions.° Recently, semitransparent masks
consisting of 0.1- to 0.2-µm films of Fe₂O₃ formed on glass by the
vapor-phase decomposition of iron pentacarbonyl have been developed
to facilitate alignment procedures during contact printing onto
photoresist-coated silicon wafers.’ The reflectivity of this material is a function
of its film thickness but is approximately only 50 percent that of
chromium at 436 nm. The substitution of this mask for chromium
masks will alleviate somewhat the standing-wave problem. Further
improvement will be obtained through the use of darker resists, in
which the reflected light will be a small fraction of the incident light.

Recent measurements of the characteristic curves of photoresists
(developed film thickness versus exposure, the slope of which is referred
to as the gamma of the system) indicate that sharp image formation
requires relatively high intensity modulation in the projected
image.® Specifically an 80 percent modulation is required for a
normally developed 0.4-µm film of Kodak Thin Film Resist (KTFR).
This is a stringent demand on the optics of the system since it implies
(Fig. 4) a usable resolution in the resist of only 0.18 the limiting frequency
of an aberration-free system.? Although the use of gentler
development conditions and dilute developers leads to some lowering
of the contrast requirement,® the resolution of this difficulty awaits
the development of higher gamma photoresists.
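
The link between a required modulation and a usable fraction of the limiting frequency can be explored numerically. The sketch below uses the standard textbook expression for the diffraction-limited MTF of a circular pupil in incoherent light; this is an assumption, since the exact curve plotted in Fig. 4 is not reproduced here and may differ. With this expression an 80 percent modulation corresponds to a normalized frequency near 0.16, in the same neighborhood as the 0.18 quoted above.

```python
import math


def diffraction_mtf(nu):
    """Textbook MTF of an aberration-free circular pupil; nu = w / w_lim."""
    if nu >= 1.0:
        return 0.0
    return (2.0 / math.pi) * (math.acos(nu) - nu * math.sqrt(1.0 - nu * nu))


def frequency_for_modulation(target, tol=1e-9):
    """Bisection for the normalized frequency at which the MTF equals target."""
    lo, hi = 0.0, 1.0            # the MTF falls monotonically from 1 to 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if diffraction_mtf(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


nu_80 = frequency_for_modulation(0.80)
mtf_at_018 = diffraction_mtf(0.18)
```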

The most obvious limitation of present photoresist resins with
respect to their use in a flash-source step-and-repeat camera is their
low sensitivity. In Fig. 5 the measured values of the spectral sensitivities
of four types of photoresist and KHRP are presented. The
KHRP sensitivity refers to the reciprocal of the energy necessary in
exposure to reach an optical density of 1.5 on development in Kodak
type D-19 developer. The measurements on photoresist were carried
out using 0.2-µm films of negative resist and 0.5-µm films of positive
resist. The box straddling the 436-nm line represents the measured
energy output of the camera using type FX-76 xenon flash lamps,
manufactured by the EG & G Company of Boston, Massachusetts, and
a 15-nm-wide bandpass filter. Obviously, all the resists fall short of
the camera requirements and again the need for the development of a
new class of resists is implied.

Fig. 4—The modulation transfer function of an aberration-free system as a
function of the normalized spatial frequency. ω is the line frequency in cycles per
millimeter, and ω_lim is the high-frequency limit imposed by diffraction effects, a
function only of the numerical aperture (N.A.) of the lens and the wavelength of
light (λ).


III. UNCONVENTIONAL PHOTOGRAPHIC PROCESSES 


There are a number of recently developed photographic processes
which would appear, at first glance, to be candidates for the required
thin recording medium. It is not within the scope of this paper to present
each of these in detail or to analyze their current uses; rather,
we merely seek to correlate their sensitivity and resolution limits with
the requirements of our system. This information is presented in Fig.
6 in the form of a correlation diagram of the measured or reported
maximum photosensitivities and resolution limits. The box outline in
the center of the diagram serves as the goal with an ordinate range
corresponding to the output per flash of the step-and-repeat camera,
50 to 250 µJ/cm², and an abscissa range of 250 to 1000 cycles/mm or
equivalently 2- to 0.5-µm lines.
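
The equivalence between spatial frequency and line width follows from one cycle being one line plus one equal space. A one-function Python check (the function name is illustrative):

```python
def line_width_um(cycles_per_mm):
    """Width of one line when a cycle is one line plus one equal space."""
    period_um = 1000.0 / cycles_per_mm   # 1 mm = 1000 um
    return period_um / 2.0


goal_widths = (line_width_um(250), line_width_um(1000))
```

This returns 2.0 µm for 250 cycles/mm and 0.5 µm for 1000 cycles/mm, matching the abscissa range of the goal box.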

Point A, for Kodak Plus-X film, is shown merely to relate the
scales used to a familiar system having an ASA rating of 125. B represents
KHRP with the vertical spread representing the speed difference
between plates processed in Kodak D-19 and Kodak HRP developers.
C represents a dyed version of the same high-resolution plates, which
will be discussed in Section IV of this paper. Similarly, H represents the
sensitivity limits of KOR and AZ1350 photoresists at λ = 436 nm as
shown in Fig. 5 together with resolution limits found in contact printing
these systems. These serve to summarize the other sections of this
paper, showing that photoresists fall short of the goal while the dyed
plates, with their enhanced modulation, fall within the goal.


Line D represents an interesting extrapolation of the common photographic
system to its thinnest version, one without the gelatin matrix.
This consists of a 0.3-µm film of AgBr evaporated on a glass substrate
which, after exposure, may be developed in common photographic developing
solutions."” While this process does involve amplification and
falls within the goal in Fig. 6, it fails as a photomask system because
of its very low gamma (0.3 ≤ γ ≤ 1.5) and its tendency towards infectious
fogging on development, i.e., several AgBr grains develop for
each exposed AgBr grain.

Fig. 5—Spectral sensitivity curves of Kodak High Resolution Plates and some
common photoresists. The box about the 436-nm line indicates the reciprocal of
the available flash energy range in the step-and-repeat camera. KOR, KTFR, and
KPR are photoresist formulations manufactured and sold by Eastman Kodak
Company, Rochester, New York; and AZ1350 is a photoresist formulation sold by
the Shipley Company, Newton, Massachusetts.

Fig. 6—Sensitivity-resolution correlation diagram showing the target region of
the step-and-repeat camera design and the demonstrated performance of a number
of unconventional photographic processes identified in the text.

The line E and point G represent systems which use photographic
physical development as their amplification step. The Philips PD*
process is represented by E and the Itek RS† process by G. 17 ?? The
highest-resolution PD-MD1 version of the Philips process does not
have the speed necessary for our camera. Neither of these systems is
at present commercially available in a form suitable for photomask
applications.

* N. V. Philips Gloeilampenfabrieken, Eindhoven, The Netherlands.
† Itek Corporation, Lexington, Massachusetts.

Another candidate for the appropriate speed range is photopoly- 
merization in which free radical chain propagation steps should pro- 
vide the necessary amplification. As indicated by F one such system 
has been developed.1? This is based on the photopolymerization of 
barium diacrylate to form either an opaque light-scattering image or a 
clear phase-only image with 0.5-µm resolution. Its speed lies within a
factor of five of our requirement; but this data is for a polymerization
in aqueous solution approximately 178 µm thick.

In Fig. 6, I and J refer to organic color-forming photographic systems.
The Dupont* Dylux® system I develops an intense blue image
from colorless precursors on exposure to ultraviolet light; photodeactivation,
or fixing, is carried out by exposure to visible light.1* The
“free-radical photography” J developed by Horizons Inc.† involves the
photochemical reaction of arylamines and carbon tetrabromide leading
to a variety of colored images which may be fixed by heating.1®> The
resolution capability of each is inherently high because of the molecular
nature of the imaging species but their sensitivity is low because
they lack amplification steps.‡ One may calculate the minimum energy
necessary to expose at 436 nm a unit quantum yield process to achieve
an optical density of 1.0, assuming an extinction coefficient of 10⁶ cm⁻¹
(the highest known value) for an organic molecule of density 1.0 and
molecular weight 300; the result is 0.9 mJ/cm². This upper limit to the
sensitivity of nonamplified photochemical processes is indicated in Fig. 6
by the dashed horizontal line.
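
The estimate for a nonamplified process can be retraced numerically. The sketch below is one reading of the calculation, not the author's worked derivation: it takes the extinction coefficient as 10⁶ cm⁻¹ (optical density per centimeter of the pure material) and the exposing wavelength as 436 nm, values under which the arithmetic reproduces the quoted 0.9 mJ/cm², and it assumes that each colored molecule requires one absorbed photon. The function name is illustrative.

```python
PLANCK_J_S = 6.62607015e-34
LIGHT_SPEED_M_S = 2.99792458e8
AVOGADRO = 6.02214076e23


def min_exposure_mj_per_cm2(optical_density, extinction_per_cm, density_g_cm3,
                            molecular_weight, wavelength_nm, quantum_yield=1.0):
    """Energy per cm^2 needed to form enough dye for the given optical density."""
    # Thickness of pure dye that gives the requested optical density.
    thickness_cm = optical_density / extinction_per_cm
    # Molecules in that layer per square centimeter.
    molecules_per_cm2 = density_g_cm3 / molecular_weight * AVOGADRO * thickness_cm
    # Energy of one photon at the exposing wavelength.
    photon_energy_j = PLANCK_J_S * LIGHT_SPEED_M_S / (wavelength_nm * 1e-9)
    # Assumed: one absorbed photon per dye molecule at unit quantum yield.
    return molecules_per_cm2 * photon_energy_j / quantum_yield * 1e3


limit = min_exposure_mj_per_cm2(1.0, 1e6, 1.0, 300.0, 436.0)   # about 0.9
```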

Finally, point K represents lead-iodide photography.** Thin evaporated
layers of PbI₂ become transparent when exposed to blue or ultraviolet
light at temperatures in excess of 160°C. Similar behavior has
been observed in other halides, such as BI₃ and CdI₂, and chalcogenides,
such as PbS and CdS. In all cases the sensitivity is very low,
requiring approximately 1 J/cm² for an optical density change of 0.6.

A wide variety of classes of unconventional photographic processes 
is represented by the above selection, D through K in Fig. 6, none of
which fulfills the requirements of the step-and-repeat camera. This
survey does serve to focus our attention on amplified versus nonampli- 
fied photographic processes. 


* E. I. duPont de Nemours & Company, Inc., Wilmington, Delaware.

† Horizons, Inc., Cleveland, Ohio.

‡ Note added in proof. One version of the Horizons system is capable of
amplification by photo development, but this produces 0.5-µm grain size.

IV. DYED PHOTOGRAPHIC EMULSIONS


Another approach to the solution of the problem is to utilize thinner 
coatings of the high-resolution photographic emulsion. However, the 
suppliers claim that thinner coatings cannot be produced with the 
same degree of uniformity and quality control. We have suggested 
that as an alternative approach we need only make the usual emul- 
sion effectively thinner.?? It is known that exposure of photographic 
emulsions to ultraviolet light in the region of strong absorption by the 
silver halide results in images confined to the top layers of the emul- 
sion.1* This behavior may be duplicated in other spectral regions by 
dyeing the emulsion such that only the top few microns can be effec- 
tively exposed. The focal plane of the projection system may also be 
confined to this same region by the use of distance pieces or pneumatic 
gauging which ride on the top surface of the emulsion. 

What is required is a nonfluorescent water-soluble dye, strongly ab- 
sorbing of the exposure wavelength, which may be readily and uni- 
formly imbibed by the gelatin and yet may be subsequently removed 
in normal processing so as not to lower the overall image contrast. 
Specifically for λ = 436 nm, I have characterized three suitable dyes,
that is, metanil yellow, tartrazine, and naphthol yellow S. In aqueous 
solution these have somewhat broad absorption peaks with maxima at 
435 nm for metanil yellow, 425 nm for tartrazine, and at 390 and 425 
nm for naphthol yellow S. The specular optical density (D) of each 
dyed plate at 436 nm is a linear function of the weight percent (C) 
dye in the dyeing solution in the low-concentration region of interest. 
The molar extinction coefficient and the D versus C relationship for 
each of the dyes at 436 nm are presented in Table I. 

The recommended procedure is to dye the plate by five-minute im- 
mersion in a gently rocking solution of C weight percent dye plus 0.02 
percent nonionic wetting agent. To maintain the initial plate quality, 
all solutions are filtered to remove particles larger than 0.1 μm, and
the dyeing is carried out in clean hoods equipped with type-1A safe- 
lights. 


TABLE I—DYE ABSORPTION PARAMETERS AT 436 nm

Dye                   ε (liter/cm-mole)    Dyed Plate Density (D)
Metanil yellow        2.12 × 10^4          11.5C + 0.3
Tartrazine            1.77 × 10^4          1.70C + 0.3
Naphthol yellow S     1.40 × 10^4          3.75C + 0.3
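
The linear fits in Table I can be evaluated directly. The following is a minimal sketch, assuming the tabulated slopes and the common 0.3 intercept hold over the low-concentration range of interest; the function and variable names are illustrative only.

```python
# Linear fits from Table I: D = slope * C + 0.3 at 436 nm,
# where C is the weight percent of dye in the dyeing solution.
DYE_SLOPES = {
    "metanil yellow": 11.5,
    "tartrazine": 1.70,
    "naphthol yellow S": 3.75,
}
BASE_DENSITY = 0.3  # intercept shared by all three fits


def dyed_plate_density(dye: str, weight_percent: float) -> float:
    """Specular optical density at 436 nm of a plate dyed in a C wt% solution."""
    return DYE_SLOPES[dye] * weight_percent + BASE_DENSITY


def concentration_for_density(dye: str, density: float) -> float:
    """Invert the linear fit: weight percent needed for a target density."""
    return (density - BASE_DENSITY) / DYE_SLOPES[dye]


# A 0.2 percent tartrazine solution, the concentration chosen in the text.
print(round(dyed_plate_density("tartrazine", 0.2), 2))
```

For 0.2 percent tartrazine the fit gives 1.70 × 0.2 + 0.3 = 0.64, the density quoted for the plates used in the evaluation.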

Fig. 7—Photomicrographs of high resolution test target images projected
through an f/1.5 lens recorded in dyed and nondyed photographic emulsions at the 
indicated exposure times using 436-nm light. 

To characterize the influence of the dye on the photographic response
of the plate, monochromatic exposures of a series of plates of varying 
dye concentration were made through calibrated step tablets. The 
specular optical density of each step of the developed plates was then 
measured. The resulting family of characteristic curves for tartrazine- 
dyed plates indicated that both the speed and gamma, the slope of 
the characteristic curve, of the system decrease with increasing dye
concentration. For all further evaluation, plates dyed in a 0.2 percent
tartrazine solution were selected since their fourfold decrease in speed 
lies within the exposure capabilities of the step-and-repeat camera. 
These plates have a specular optical density of 0.64 at 436 nm which 
is sufficient to eliminate the necessity for an antihalation backing. 

Fig. 8—Density modulation versus spatial frequency of the test images recorded
in the dyed plate (circles, 0.05-second exposure of Fig. 7) and the nondyed plate
(triangles, 0.02-second exposure of Fig. 7).

A series of exposures of nondyed KHRP and 0.2 percent tartrazine-dyed
KHRP were made in a test camera employing the f/1.5 lens
and 436-nm illumination. Photomicrographs of the results of highest
resolution are presented in Fig. 7. The bar widths in microns for the
eleven 15-bar patterns in the second group on the Ealing* 3¢22-863
test target at 10X reduction are (in the order in which they appear
in the photomicrographs, clockwise from 6 o’clock): 5.0, 4.0, 3.15, 2.5,
1.99, 1.58, 1.26, 1.0, 0.79, 0.63 and 0.50 μm. On qualitative comparison
the dyed plates appear to resolve finer lines while at the same time 
providing better modulation of the low frequency bar patterns. 
Quantitative measurements of the apparent modulation improvement
were carried out on the Ansco Model 4 recording microdensitometer.
The 0.02-second exposure of the undyed plate and the 0.05-second
exposure of the dyed plate, selected as having the highest
resolution on microscopic evaluation, were measured using a 20X, 0.4
N.A. objective with a 5-μm illuminating slit and a 1-μm scanning
slit in the microdensitometer. The results are presented in Fig. 8


* The Ealing Corporation, Cambridge, Massachusetts. 
in which the density modulation (M) is defined:

$$M = \frac{\bar{D}_{\max} - \bar{D}_{\min}}{\bar{D}_{\max} + \bar{D}_{\min}}$$


The averages were taken over the seven bars and six spaces at the 
center of each 15-bar pattern. The data demonstrate the improved 
modulation of the dyed plate at all frequencies plus the slightly 
higher resolution capability. 
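
As an illustration of how such a modulation figure could be computed from a microdensitometer trace, the sketch below assumes M is the usual contrast ratio formed from the averaged bar density and the averaged space density. The density readings are invented for the example and are not data from Fig. 8.

```python
from statistics import mean


def density_modulation(bar_densities, space_densities):
    """Contrast ratio of the averaged bar and space densities.

    The larger of the two averages is treated as the maximum, so the
    result is non-negative whichever of bars or spaces is darker.
    """
    d_bars = mean(bar_densities)
    d_spaces = mean(space_densities)
    d_hi = max(d_bars, d_spaces)
    d_lo = min(d_bars, d_spaces)
    return (d_hi - d_lo) / (d_hi + d_lo)


# Hypothetical readings: seven bars and six spaces from the center
# of one 15-bar pattern, as in the averaging described above.
bars = [2.0, 2.1, 1.9, 2.0, 2.0, 2.1, 1.9]
spaces = [1.0, 1.1, 0.9, 1.0, 1.1, 0.9]
print(density_modulation(bars, spaces))
```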

The images thus formed in a dyed emulsion were used to control 
the exposure of a 0.2-μm-thick film of KTFR photoresist by routine
contact printing procedures resulting in usable 1-μm lines, demonstrating
that the photographic image had sufficient developed optical
density. Thus, all the speed and resolution requirements of the camera 
are fulfilled by these dyed photographic emulsions with the added 
benefits of increased modulation of low-frequency images and the 
elimination of the antihalation backing. At the time of this writing, 
commercial versions of this dyed high-resolution plate are coming on 
the market. 

Device Photolithography:


A Computer Controlled Coordinate 
Measuring Machine 


By F. R. ASHLEY, Miss E. B. MURPHY and H. J. SAVARD, Jr. 
(Manuscript received June 29, 1970) 


In development and operation of the mask-making laboratory, a precise 
positional measurement system is needed. This paper describes a system 
based on a Do-all coordinate measurement machine, controlled by a PDP-8 
computer. The computer handles all sequential operations as well as com- 
putation necessary for coordinate transformation and feature location. The 
result is a system which can measure an array of 208 points to an accuracy
of ±1 μm in less than two hours. Without computer control, measurement of
such an array is not feasible.


I. INTRODUCTION 


In the design of the mask-making laboratory, the need for a precise 
positional-measurement system was recognized. This system is needed 
for alignment and adjustment of the primary pattern generator (PPG), 
the reduction cameras and the step-and-repeat camera. It is also 
needed for mask inspection. The measurement system should be at 
least ten times more precise than the tolerance on the masks being 
measured, and it should be capable of measuring a large number of 
points in a reasonable time. For example, the test pattern to align 
the PPG is an array of 208 points, and this should require no more 
than two hours to measure. Table I summarizes the requirements on 
the measurement system. 


1.1 System Description 

A Do-all Coordinate Measurement Machine (CMM) controlled by 
a PDP-8 computer forms the basis for the measurement system to 
meet these needs. This is shown schematically in Fig. 1. The Do-all 
machine consists of two air-bearing slides at 90° on black granite 
ways. The plate to be measured is mounted on one slide (x-axis) and 

TABLE I—PERFORMANCE OBJECTIVES FOR CMM SYSTEM

Field                                   X: 18 cm
                                        Y: 22 cm
Optical Power                           250
Projected Field                         400 μm
Feature Location                        ±200 μm
System Precision                        ±0.08 μm
Slew Rate (Both Axes)                   0.5 cm/sec
Plate Measurement Time (208 Points)     <2 hrs.

a microscope with projection screen is mounted above the plate on
the other slide (y-axis). The range of travel on the x and y axes is
18 cm and 22 cm respectively; by appropriate adjustment of the x-
and y-axes slides, the microscope can be positioned above any point
on a plate within an 18-cm by 22-cm range. Stepping motors move
the x and y slides through taut wire capstan drives. Fine manual 
positional adjustment is provided for by two torque transmitter and 
receiver pairs on a separate control panel. Fringe counting interferometers
using a HeNe laser source provide precise positional information
on the x and y axes. The counters display the total counts (1
count = 0.0791 μm) that the x and y slides have moved from some
predetermined origin. Plate features to be measured are optically
projected onto a screen to allow the operator, operating as a feedback
element, to trim the location of the plate feature with respect to a
reticle on the projection screen. The interface provides an interaction
channel between the CMM, the PDP-8 and the operator.

Fig. 1—Block diagram of measurement system.


II. ADVANTAGES OF COMPUTER CONTROL


The use of a general purpose computer as a control element for the 
measurement system has a number of advantages over a “hard wired”
controller. First, there is the flexibility that the stored program allows. 
Modifications of the control functions and correction of errors are 
done by changing the program, not wiring. Very often this involves 
changing just a few instructions in memory and can be done right 
at the computer console in a matter of minutes. Second, the computer 
offers much greater input-output capacity; the system can be expanded 
to use disk or magnetic tape if required for future needs. The third 
advantage is that the computer can transform the coordinate system 
of a plate to the CMM coordinate system. This transformation can 
take into account (i) plate rotation with respect to the CMM, (ii)
the deviation of the angle between the x and y axes of the CMM from
90° (skew angle), and (iii) conversion of units of counts to metric
units or address units of the plate. A fourth advantage is that the
computer allows feature location on a plate. Since the computer is
interfaced to read the x and y counters and to pulse the x- and y-stepping
motors, a computer program can be written to position the
CMM microscope over any desired point on a plate. 


III. MEASUREMENT SYSTEM DESIGN


The two basic areas of design in the CMM system—design of the 
interface and program design—are discussed in the following para- 
graphs. 


3.1 Interface 

The interface was designed with simplicity as the objective, at the 
possible expense of more programming. This is feasible because of the 
high speed of the PDP-8 relative to the mechanical speed of the CMM. 
The block diagram for the interface is shown in Fig. 2. The interface 
allows the computer to read information (counter readings, control 
switches and data inputs) into its accumulator, and allows the computer
to transfer data from its accumulator to external devices. It also
allows input-output transfer (IOT) pulses to be output by the computer.

Fig. 2—Interface block diagram.

The x- and y-axes stepping motors are driven by IOT pulses which 
occur in response to IOT instructions in the computer program. The 
direction of the step is controlled by the accumulator. One step of 
the motor causes a displacement of about 250 counts (20 μm) on either
the x- or y-axis. The maximum rate of the motors is 200 steps per
second giving a slew rate of 4 mm per second. 

The x and y counters are nine-digit counters with binary-coded 
decimal (BCD) output. Since one computer word is only 12 bits, it is 
necessary to provide a 36-bit storage register for each counter. This 
register is read into the computer in three 12-bit bytes. Care must 
be exercised in transferring the outputs of a counter to its storage 
register, in that the transfer must not occur when the counter is in 
a transition. This is taken care of by an update circuit which transfers 
the counter outputs to the register only at a fixed time delay after a 
change in the least-significant bit (LSB). This time delay is chosen
to be less than the time between input transitions to the counter. The
routine to read one counter and store its BCD output in memory
requires 33 μs.
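
The conversion of a nine-digit BCD reading held in three 12-bit words into a single count can be sketched as follows. This is an illustration in Python rather than PDP-8 code, and it assumes that the most-significant word is read first, that each word packs three 4-bit decimal digits, and that the reading is non-negative; none of these details is stated in the text.

```python
def decode_bcd_words(words):
    """Convert three 12-bit BCD words (most-significant first) to an integer."""
    if len(words) != 3:
        raise ValueError("expected three 12-bit words")
    value = 0
    for word in words:
        if not 0 <= word <= 0o7777:
            raise ValueError("word does not fit in 12 bits")
        # Each 12-bit word carries three decimal digits, 4 bits apiece.
        for shift in (8, 4, 0):
            digit = (word >> shift) & 0xF
            if digit > 9:
                raise ValueError("invalid BCD digit")
            value = value * 10 + digit
    return value


COUNT_UM = 0.0791  # micrometers per count, as quoted in the text

words = [0x000, 0x012, 0x345]  # hypothetical register contents
count = decode_bcd_words(words)
print(count, "counts =", count * COUNT_UM, "micrometers")
```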

There are a number of manual controls available for the operator.
These are operable only when the program is in the manual control
mode (Fig. 3). These controls consist of switches whose state is
sensed by the computer through the accumulator bus. A toggle switch
MANL allows the operator to enter the manual mode. Push-button
switches X+, X-, Y+, Y- allow the operator to manually position
the microscope by pulsing the x- or y-stepping motors at a 200 pps
rate. Push-button switch CLEAR causes the present counter reading to
be stored in core and then both counters to be set to zero. Push-button
switch OUTPUT causes the present position in address units to
be output on the teletype. Toggle switch TTY allows the operator to
input x-y coordinates in plate-address units; the MOVE routine,
described later, is then entered and causes the desired coordinates to
be located.

Fig. 3—Flowchart of control program.


3.2 The Computer Program 

A simplified flow chart for the control program is shown in Fig. 3. 
The function of the control program is to provide a proper sequence 
of steps that will result in measurement of a plate. The slanted side 
boxes represent message lights that are illuminated when that point 
of the program is reached. The program progresses from one slanted 
side box to the next as the operator presses a continue push button 
on the control panel. The computer cycles in a loop while waiting for 
this operation. The manual mode can be accessed from any of the 
waiting loops, and is represented by the slanted side box in the center 
of Fig. 3. The boxes with curved tops and bottoms represent paper- 
tape input from the high-speed reader. In reading coordinates from 
the paper tape, a program XYINPT is called to transform plate coordinates
to CMM coordinates. This is accomplished by the following
matrix equation: 


$$\begin{bmatrix} x_m \\ y_m \end{bmatrix} = \frac{\mathrm{ADFCT}}{\cos\phi} \begin{bmatrix} \cos(\theta - \phi) & -\sin(\theta - \phi) \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x_p \\ y_p \end{bmatrix} \tag{1}$$

Figure 4 shows the plate and CMM coordinate systems. The rotation
of the plate with respect to the CMM is denoted by θ, while φ
denotes the skew angle. The quantity ADFCT is a constant to convert
address units of a plate to counts of the CMM.


$$\mathrm{ADFCT} = 8N/\lambda \ \text{counts/address}. \tag{2}$$


N is the address size in μm, and λ is the wavelength (≈0.6328 μm) of
the HeNe laser.

To compute positional errors, it is necessary to convert the CMM 
coordinates to plate coordinates. This is done by inverting the trans- 
formation (1) 


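$$\begin{bmatrix} x_p \\ y_p \end{bmatrix} = \frac{1}{\mathrm{ADFCT}} \begin{bmatrix} \cos\theta & \sin(\theta - \phi) \\ -\sin\theta & \cos(\theta - \phi) \end{bmatrix} \begin{bmatrix} x_m \\ y_m \end{bmatrix} \tag{3}$$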
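
The forward and inverse transformations can be checked numerically. The sketch below is written in Python for illustration (the original program ran on the PDP-8) and assumes the transformations have the form: CMM counts equal ADFCT/cos φ times the matrix [[cos(θ − φ), −sin(θ − φ)], [sin θ, cos θ]] applied to plate coordinates, with ADFCT = 8N/λ. The 7-μm address size and the rotation angle are hypothetical values chosen only for the example.

```python
import math

WAVELENGTH_UM = 0.6328  # nominal HeNe wavelength in micrometers


def adfct(address_size_um, wavelength_um=WAVELENGTH_UM):
    """Counts per address unit: ADFCT = 8 N / lambda."""
    return 8.0 * address_size_um / wavelength_um


def plate_to_cmm(xp, yp, theta, phi, k):
    """Plate address units -> CMM counts (angles in radians, k = ADFCT)."""
    scale = k / math.cos(phi)
    xm = scale * (math.cos(theta - phi) * xp - math.sin(theta - phi) * yp)
    ym = scale * (math.sin(theta) * xp + math.cos(theta) * yp)
    return xm, ym


def cmm_to_plate(xm, ym, theta, phi, k):
    """CMM counts -> plate address units (inverse of plate_to_cmm)."""
    xp = (math.cos(theta) * xm + math.sin(theta - phi) * ym) / k
    yp = (-math.sin(theta) * xm + math.cos(theta - phi) * ym) / k
    return xp, yp


k = adfct(7.0)                        # hypothetical 7-um address size
theta = math.radians(0.05)            # hypothetical plate rotation
phi = math.radians(11.5 / 3600.0)     # skew of 11.5 seconds of arc

xm, ym = plate_to_cmm(1000.0, 2000.0, theta, phi, k)
print(cmm_to_plate(xm, ym, theta, phi, k))  # recovers the plate coordinates
```

The round trip works because the determinant of the 2-by-2 matrix in the forward transformation is cos φ, which cancels the 1/cos φ prefactor when the matrix is inverted.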


3.2.1 Feature Location 


Fig. 4—CMM and plate coordinate systems.

When the coordinates of a feature on a plate are known, a routine
called MOVE can be used to cause the microscope to be positioned over
the feature. This is done through the stepping motors. Since one step
is 20 μm, the positioning accuracy is 20 μm on each axis, which is
well within the field of the microscope. Thus, to locate a feature, the
following procedure takes place for each axis:


(i) The desired counter reading is computed using equation (1) and
is stored in the PDP-8 memory.

(ii) The quantity Δ, which is the difference between the desired
counter reading and the present counter reading, is computed.

(iii) If |Δ| < 500 counts, the feature is located and the procedure
terminates; otherwise go to (iv).

(iv) A pulse is applied to the stepping motor; the sign of Δ determines
the direction of the step. After a 5-ms delay (to allow the motor
to complete its step) go to (ii).

Thus, while slewing to a position over a feature, the x- and y- 
stepping motors are pulsed continuously, and between pulses, the 
counters are read to see if the desired readings are obtained. The 
sequencing of MOVE insures that the first time the desired condition of 
being within 500 counts (40 μm) of the feature is obtained, then the
motion of that axis is complete. No more pulses are output to that 
stepping motor. This eliminates hunting that would occur due to the 
time lag between application of a pulse to a stepping motor and motion 
of the slide. 
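
The single-axis positioning loop described above can be simulated in a few lines. This is a sketch in Python, not the PDP-8 routine: the `SimulatedAxis` class is hypothetical, each step is taken to move exactly 250 counts, and the 5-ms settling delay is only noted in a comment.

```python
STEP_COUNTS = 250   # approximate counts moved per motor step (about 20 um)
TOLERANCE = 500     # stop once within this many counts (about 40 um)


def move_axis(read_counter, pulse_motor, target):
    """Step one axis until the counter is within TOLERANCE of target.

    read_counter() returns the present counter reading;
    pulse_motor(direction) issues one step, direction being +1 or -1.
    Returns the number of pulses issued.
    """
    pulses = 0
    while True:
        delta = target - read_counter()      # desired minus present reading
        if abs(delta) < TOLERANCE:           # close enough: stop for good
            return pulses
        pulse_motor(1 if delta > 0 else -1)  # sign of delta sets direction
        pulses += 1
        # Real hardware would wait about 5 ms here for the step to finish.


class SimulatedAxis:
    """Idealized slide that moves exactly STEP_COUNTS per pulse."""

    def __init__(self, position=0):
        self.position = position

    def read(self):
        return self.position

    def pulse(self, direction):
        self.position += direction * STEP_COUNTS


axis = SimulatedAxis()
n = move_axis(axis.read, axis.pulse, target=10_000)
print(n, axis.position)
```

Because the loop returns the first time the reading falls inside the tolerance band and never issues a correcting pulse afterwards, it cannot oscillate about the target, which is the hunting-avoidance property the text describes.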

IV. MEASUREMENT OF A PLATE


The process of measurement of a plate is illustrated by the flow 
chart of Fig. 3. Initially, the microscope is positioned over a fixed 
point that is the CMM origin. Prior to measurement of a plate, a 
paper tape is made containing the distance from the CMM origin to
the origin of the plate coordinate system. Following on the tape are
the coordinates of a reference point on the x-axis, followed by the co- 
ordinates of all points to be measured; these points are expressed in the 
plate coordinate system. This paper tape is placed in the high-speed 
reader of the PDP-8, and is read under program control. Also prior to 
measurement, three data-input thumbwheel switches are set. These 
contain the CMM skew angle in seconds of arc, the factor N which 
is the address size in microns and the three least-significant digits 
of the wavelength of the HeNe laser. The wavelength of the HeNe
laser is represented to an accuracy of seven digits in the PDP-8:
0.6328XXX μm, where XXX is read in on thumbwheel switches.

During the preliminary steps of the measurement, the plate origin 
is located and both counters are cleared. Prior to being cleared, both 
counters are read and the negative of their readings are stored in 
the PDP-8 memory. This enables the program to return the CMM 
microscope to the CMM origin at the end of the measurement. The 
reference point on the x-axis is located next. This enables the PDP-8
to compute the angle θ, θ = tan⁻¹(y_ref/x_ref), where (x_ref, y_ref) are
the coordinates of the reference point on the x-axis as read from
the CMM counters. The program now has enough information to com- 
pute the coordinate transformations (1) and (3). The program then 
proceeds to measure points as they are read from the tape, printing 
the errors on the ASR 33. An end of tape character signals the last 
point to be measured and causes the programs to terminate with the 
microscope positioned over the CMM origin. 
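
The rotation angle computed from the reference point can be illustrated with a short Python sketch. It takes the angle as the arctangent of the ratio of the y and x counter readings at the reference point; the readings used are hypothetical, and `atan2` is used in place of a plain arctangent only so that a zero x reading does not cause a division error.

```python
import math


def plate_rotation(x_ref, y_ref):
    """Rotation of the plate x-axis relative to the CMM x-axis, in radians."""
    return math.atan2(y_ref, x_ref)


def to_arcseconds(radians):
    """Convert an angle from radians to seconds of arc."""
    return math.degrees(radians) * 3600.0


# Hypothetical counter readings at the x-axis reference point, in counts.
theta = plate_rotation(1_000_000, 500)
print(f"{to_arcseconds(theta):.1f} seconds of arc")
```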


V. CONCLUSIONS


The measurement system as described has been successfully used 
to measure plates from the PPG, the 3.5X reduction camera and the 
step-and-repeat camera. The main limitation on accuracy is the 
ability of the operator to align the desired feature with the reticle of 
the microscope. This in turn depends very much on the line-edge 
definition. For example, features generated by the PPG have been 
measured with ±1-μm accuracy; the PPG generates line edges that
are typically defined over a 5-μm distance.

The objectives on measurement time also have been met. A test
array of 208 points for alignment of the PPG can be measured in 
less than two hours. Without computer control, the measurement time 
would be so long that drift problems and operator fatigue would make 
the measurement unfeasible. 

APPENDIX A


Skew Angle Measurements 


The skew angle φ as defined in Fig. 4 is the deviation from 90° of
the x and y CMM axes. To measure φ, a glass plate having three
marks is used. An origin mark and marks on lines approximately 90°
apart define the x_p and y_p axes as shown in Fig. 5. First the plate is
placed on the CMM as shown in Fig. 5a with the axis of the plate
roughly aligned with the CMM axis. Angles A and B may now be
accurately measured by use of the CMM counters resulting in the
following equation:


$$A + 90^\circ + \phi = B + 90^\circ + \delta_p. \tag{4}$$


Fig. 5—Skew angle measurements. (a) Plate aligned with CMM axes. (b) Plate
rotated 90° from CMM axes.

The measurement is then repeated with the plate rotated approximately
90° with respect to the CMM axis as shown in Fig. 5b. Angles
C and D are now measured resulting in the equation:

$$C + 90^\circ - \phi = D + 90^\circ + \delta_p. \tag{5}$$

Equations (4) and (5) may be solved simultaneously to give the
equation:

$$\phi = \tfrac{1}{2}(B - A + C - D). \tag{6}$$

Measurements of this type indicate that φ is about 11.5 seconds of
arc. φ must be measured after any disassembly of the y-axis support.
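
The elimination behind equation (6) can be written out as a short check. It assumes that equations (4) and (5) have the form A + 90° + φ = B + 90° + δ_p and C + 90° − φ = D + 90° + δ_p, where δ_p is read here as the plate's own departure from 90° (an interpretation of the symbol, not a definition taken from Fig. 5). Subtracting the second equation from the first gives:

```latex
\begin{align*}
(A + 90^\circ + \phi) - (C + 90^\circ - \phi)
  &= (B + 90^\circ + \delta_p) - (D + 90^\circ + \delta_p) \\
A - C + 2\phi &= B - D \\
\phi &= \tfrac{1}{2}\,(B - A + C - D)
\end{align*}
```

Under this reading the plate term δ_p cancels, which would explain why the marks on the plate need only be approximately 90° apart.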


Device Photolithography: 


The Mask Shop Information System 


By MRS. J. G. BRINSFIELD and S. PARDEE 
(Manuscript received June 17, 1970) 


The Mask Shop Information System (MSIS) is a set of computer 
tasks which exist in a specially designed multi-programming environment 
within a PDP-9 computer, and which control the flow of jobs through the 
new mask-making facility. The main functions of MSIS are to accept job 
descriptions and to assign tasks and pass data to the various shop facilities 
so that these jobs can be efficiently processed. In addition, MSIS keeps 
statistics on the progress and problems of the shop and issues reports 
both periodically and upon demand. 


I. INTRODUCTION 


A computer-based information and control system, referred to as 
the Mask Shop Information System (MSIS), assists in running the 
Bell Telephone Laboratories mask-making facilities at Murray Hill, 
New Jersey, and Allentown, Pennsylvania. In the planning stage for 
the new mask-making facility, it was realized that the scheduling and 
processing information required for efficient control of the flow of jobs 
would be too complicated to handle with paper work. Furthermore, 
keeping track of the large number of glass plates passing through the 
facility would be a problem. Thus, it was decided to develop MSIS. 
Briefly, MSIS controls the entire mask-making facility and serves as 
a repository of information on the status of each job, the location of 
each plate, and the performance of the overall facility. The equip- 
ment that makes up the new facilities has been discussed elsewhere.1* 

The first part of this paper will deal with the functions performed 
by MSIS and how the system appears to the user; the second part will 
describe the organization of computer programs and data required to 
implement MSIS. 


II. MSIS FUNCTIONS 


The significance of MSIS can best be grasped by reviewing the 
various functions that are performed. 

2.1 Scheduling


MSIS schedules each process step required to complete a mask. 
This scheduling is done on-line and allows for the inclusion of jobs 
of varying priority. As each task is completed by either a human 
operator or a machine, such as the primary pattern generator (PPG), 
MSIS determines the highest-priority task waiting and assigns it to 
the operator or machine that is idle. 
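
The dispatching rule just described, in which an idle operator or machine is handed the highest-priority waiting task, can be sketched with a priority queue. This is an illustration in Python, not the MSIS implementation; the class name, the job labels, the convention that a lower number means higher priority, and the first-come tie-break are all assumptions of the sketch.

```python
import heapq
import itertools


class TaskQueue:
    """Waiting tasks for one resource (an operator or a machine)."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-break: first come, first served

    def add(self, priority, task):
        # Lower number = higher priority (a convention of this sketch).
        heapq.heappush(self._heap, (priority, next(self._order), task))

    def next_task(self):
        """Pop the highest-priority waiting task, or None if nothing waits."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


ppg_queue = TaskQueue()
ppg_queue.add(2, "job-A artwork")
ppg_queue.add(1, "rush job artwork")
ppg_queue.add(2, "job-B artwork")

# Called each time the PPG reports that it has become idle.
print(ppg_queue.next_task())
```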


2.2 Control Information 


MSIS transmits control information over wide-band data links to 
the two control computers attached to the PPG and the step-and- 
repeat camera. For the PPG, this information indicates the magnetic 
tape reel number and the file number within that reel that contains 
the information describing the artwork to be generated next. For the 
step-and-repeat camera, the identification of the specific reticles needed 
to make a particular mask as well as the step-and-repeat array in- 
formation is transmitted from MSIS to the control computer. Other 
control information is transmitted directly to human operators via spe- 
cial displays and teletypewriters. 


2.3 Information Storage 


MSIS maintains an extensive disc file containing information such as 
(i) the status of every job in process,

(ii) performance statistics covering each process step as well as the
overall mask-making facility,

(iii) inspection information required to define special mask features
that should be inspected in detail, and

(iv) the step-and-repeat array information necessary to define a
complete mask for silicon circuits.


2.4 Glass-Plate Handling 


To avoid the confusion of human operators sorting through a moun- 
tain of glass plates to find a specific reticle, piece of artwork, or mask, 
MSIS assigns each piece of glass to a numbered slot within a numbered 
carrier. The location of each piece of glass is remembered so that when 
it is needed as the input to another process step, its exact location can 
be supplied to the operator. 
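
A bookkeeping scheme of this kind amounts to a table from plate identifier to (carrier, slot). The following Python sketch is illustrative only; the class, its method names, and the rule of handing out the first free slot are assumptions rather than details of MSIS.

```python
class PlateRegistry:
    """Tracks which numbered slot of which numbered carrier holds each plate."""

    def __init__(self, carriers, slots_per_carrier):
        # Free locations in carrier order, then slot order.
        self._free = [
            (carrier, slot)
            for carrier in range(1, carriers + 1)
            for slot in range(1, slots_per_carrier + 1)
        ]
        self._location = {}

    def store(self, plate_id):
        """Assign the next free slot to a plate and return (carrier, slot)."""
        if plate_id in self._location:
            raise ValueError(f"{plate_id} is already stored")
        if not self._free:
            raise RuntimeError("no free slots")
        location = self._free.pop(0)
        self._location[plate_id] = location
        return location

    def locate(self, plate_id):
        """Return where a stored plate sits, for display to the operator."""
        return self._location[plate_id]

    def retrieve(self, plate_id):
        """Remove a plate from storage and free its slot."""
        location = self._location.pop(plate_id)
        self._free.append(location)
        return location


registry = PlateRegistry(carriers=2, slots_per_carrier=3)
print(registry.store("reticle-1"))    # first free slot
print(registry.locate("reticle-1"))
```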


2.5 Inquiries and Reports 


MSIS will allow certain on-line inquiries to be made from a tele- 
typewriter terminal. Some on-line inquiries might be: 
(i) What is the status of my job?
(ii) What is the backlog of work for the reduction cameras?
(iii) How many pieces of artwork have been generated this shift?


In addition to these short on-line inquiries, more detailed management 
reports will be generated by MSIS on a daily, weekly, monthly, or 
quarterly basis. Certain of these management reports will also be 
available on demand. 


III. USER/COMPUTER INTERFACE 


There are two sets of users that must interface with MSIS. The first 
user is the engineer or designer who wishes to request that a particular 
set of masks be manufactured. The other is the mask-shop operator 
who must exchange information with the computer system while com- 
pleting his job. 

The engineer or designer will communicate his needs to MSIS by 
a set of instructions on punched cards that can be included in his 
XYMASK input deck or can be submitted separately with the post- 
processed XYMASK tape that is required. Figure 1 shows an example of 
these instructions for a typical set of masks for a silicon circuit. Both 
tantalum- and silicon-circuit masks can be handled with equal ease; 
but a silicon circuit is used in this example because in general it requires 
that more information be supplied. 

The first two cards contain standard identifying information; an 
engineer might have a number of these cards duplicated to have when 


JOB DESCRIPTION 


ENGINEER MH, 1112, B65420, J.H. GILMBRE X5023 
CASE 39500-20 
DEVICE A1502, BEAM LEAD GATE 


MASK B850122-1-4, 2135, ARRAY-L2 


PATTERN B413622-1-2, ART 
PATTERN L100600-1-3
PATTERN L200501-1-2


MASK 


END 


Fig. 1—Job-description information. 
needed. The third card identifies the circuit and is used primarily on
reports for ease of identifying the jobs. A MASK card is included for each
mask of the complete job. This mask card contains the drawing-level- 
issue number of the particular mask and an optional process number. 
The process number will be imaged by the step-and-repeat camera onto 
the final mask for use during circuit fabrication. The final field on the 
MASK card indicates, in this case, that a prestored step-and-repeat array 
(L2) is to be used in making the mask. It is hoped that most masks can 
be specified using one of a number of prestored array definitions. For 
those masks that require special arrays, a means is provided for defining 
the array along with the job description. In the example of Fig. 1, assume 
that three patterns are required to complete the desired mask, that is, 
the primary pattern for which artwork must be generated and two 
standard test patterns (L100600-1-3 and L200501-1-2) for which reticles 
already exist. Similarly, for each mask that makes up the job a MASK 
and three PATTERN cards would be required. 


3.1 Inspection Data 


If special features on a mask are to receive specific inspection, a 
series of cards, as shown in Fig. 2, can be included. These cards indicate 


(i) the coordinates of a fiducial mark; 
(ii) the tolerance, in microns, to be maintained;
(iii) the coordinates of a feature and its desired width; 
(iv) the coordinates of a feature and its desired height; or 
(v) the coordinates of two vertices of a feature whose edges do not 
parallel the X and Y axes. 


At inspection time, MSIS scales this information appropriately, de- 
pending on the inspection being carried out, and presents the scaled 
information to the inspector. 


INSPECTION DATA 


MARK -100, 0
TOLERANCE .5
INSPECT 100, -200, W, 52
INSPECT -500, -150, H, 10
INSPECT -40, 100, -50, 200


Fig. 2—Mask-inspection information. 
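
To make the card layout of Fig. 2 concrete, the sketch below parses such cards into records. It is an illustration in Python, not the MSIS reader: the field meanings follow the list in Section 3.1 (a W or H in the third field selects a width or height check, otherwise the four numbers are taken as two vertices), while the record layout and names are assumptions.

```python
def parse_inspection_cards(lines):
    """Parse MARK / TOLERANCE / INSPECT cards into a list of dict records."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        keyword, _, rest = line.partition(" ")
        fields = [field.strip() for field in rest.split(",")]
        if keyword == "MARK":
            records.append({"kind": "mark",
                            "x": float(fields[0]), "y": float(fields[1])})
        elif keyword == "TOLERANCE":
            records.append({"kind": "tolerance", "microns": float(fields[0])})
        elif keyword == "INSPECT":
            x, y = float(fields[0]), float(fields[1])
            if fields[2] in ("W", "H"):
                kind = "width" if fields[2] == "W" else "height"
                records.append({"kind": kind, "x": x, "y": y,
                                "size": float(fields[3])})
            else:
                # Two vertices of a feature whose edges are not axis-parallel.
                records.append({"kind": "vertices", "p1": (x, y),
                                "p2": (float(fields[2]), float(fields[3]))})
        else:
            raise ValueError(f"unknown card: {keyword}")
    return records


cards = [
    "MARK -100, 0",
    "TOLERANCE .5",
    "INSPECT 100, -200, W, 52",
    "INSPECT -500, -150, H, 10",
    "INSPECT -40, 100, -50, 200",
]
for record in parse_inspection_cards(cards):
    print(record)
```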

3.2 Operator Interface


Operator-to-computer communication is carried out by two different
means. Operators involved with the reduction cameras, contact printing, 
chrome etching, and the step-and-repeat camera will communicate 
with the system via a combination of cathode-ray-tube displays and 
keyboards. Administrative information and inspection data is com- 
municated via standard KSR35 Teletypewriters. 


3.3 Secondary Information Strip 


Another medium for conveying information required by both the 
users and the computer is the secondary information strip. Figure 3 
shows the relationship of this strip to the primary artwork as it comes 
from the PPG (not drawn to scale). Two items of information are 
placed in the strip in both human- and machine-readable form. These 
are, the drawing number (for example, B123456-4-3) and the magnifica- 
tion that was used in drawing the artwork (for example, 35). The human- 
readable portion is intended to allow the operators to verify visually 
that they have the proper piece of glass or are using the proper reduction 
camera. The machine-readable representation is repeated twice as a 
series of coded clear and opaque spots. One set of coded information 
can be read by an array of photo-diodes mounted in the reduction 
camera. The other set is imaged onto the reticle produced by the reduc- 
tion process and can be read by photo-diodes in the step-and-repeat 
camera. In both cases, MSIS uses the machine-readable information 
to insure that the proper artwork, or reticle, is mounted in the proper 
device before allowing a job step to proceed. 

Across the top of the artwork is another piece of encoded information. 
This represents the particular PPG on which the artwork was manu- 
factured, and a sequential serial number. The MSIS does not make use 
of this latter information. 


IV. EQUIPMENT CONFIGURATION 


Figure 4 shows the overall equipment configuration for the mask 
shop. In the center of Fig. 4 is the MSIS main computer, a Digital 
Equipment Corporation PDP-9. The characteristics of this machine 
and its associated hardware are shown in Table I. The MSIS computer 
is interfaced via high-speed data links directly to the control com- 
puters associated with the PPG (PDP-9) and the step-and-repeat 
camera (PDP-8). Two model 35 KSR Teletypewriters are connected 
to the system. One is for administrative purposes and the other for
use by the inspectors. Three keyboard display positions are also connected
to the system. Each position consists of a Tektronix 611 Storage
Display and a 16-position keyboard. These are used to communicate
with operators in the reduction, contact-printing, and
chrome-etching areas. Additional keyboards and displays are connected
to the two control computers.

Fig. 3—Secondary information strip.


V. A TYPICAL JOB


To help understand the functioning of the MSIS, it would be instructive
to trace a typical mask as it flows through the system. A
silicon-circuit mask will be used as the example (although the system
is designed to handle both silicon- and thin-film-circuit masks)
since it uses more facets of the system. Figure 5 shows the flow of
the mask through the system.

Fig. 4—System configuration.

When an engineer feels he has adequately debugged his circuit
masks using XYMASK, he will add the job description cards described
earlier to his XYMASK deck and make a final computer run to generate
a computer tape for use by the PPG. This tape will also
contain the job-description information. The reel of magnetic tape is
submitted to the mask shop and is mounted on the magnetic-tape unit
attached to the MSIS. The job-description information is read by
the MSIS and an instruction is given to save the reel in a particular
numbered bin. The designations of the desired masks are placed in a
queue awaiting the PPG. Their position in the queue is based on the
priority assigned to the job.


TABLE I—MSIS/PDP-9 HARDWARE CHARACTERISTICS

Devices          Characteristics
Core Memory      8K words, 18 bits, 1-μs cycle time
Disk Memory      1 million words, 17-ms average access time
Magnetic Tape    9 track, IBM compatible, 30A-character/second
Paper Tape       8 channel, 300-characters/second reader, 60-characters/second punch


Fig. 5—Trace of a typical mask as it flows through MSIS.

When the mask in question reaches the top of the PPG queue, a 
message is sent from the MSIS to the PPG control computer indicating
the reel number, bin, and file within the reel where the XYMASK
data can be found for the desired mask. When the artwork has been 
completed, the PPG control computer transmits an appropriate mes- 
sage to the MSIS. The mask designation will be removed from the 
PPG queue and assigned to a table of those masks undergoing photo- 
graphic development. 

No attempt is made to schedule or control the work as it passes 
through the photographic development area. After completing photo- 
graphic development, the artwork is passed to inspection and it is 
“logged in” at a teletypewriter. The MSIS instructs the operator to 
place the artwork in a particular numbered slot of a numbered carrier. 
At the same time the mask designation is placed on the inspection 
queue. When it reaches the top of the queue, the inspector is told by 
the MSIS where to locate the artwork and what unique features should 
be inspected. 

Assuming the artwork passes inspection, the operator signals MSIS
via the teletypewriter and the mask designation is placed on the
reduction queue. Again the artwork is assigned to a unique slot. When
the mask again reaches the top of the queue, a message is displayed
to the reduction camera operator to mount the artwork in a particular
reduction camera. The MSIS checks to ensure that the proper artwork is
mounted in the proper camera by scanning the secondary information
strip. If it is properly mounted, the MSIS initiates exposure. The
exposed reticle is then passed to photographic development.

The development-and-inspection cycle is repeated, and the reticle is
passed to the step-and-repeat camera. When the mask designation
reaches the top of the step-and-repeat queue, the MSIS transmits all
the step-and-repeat array information, including all reticles
required, to the step-and-repeat control computer. Again, at each
stage of the step-and-repeat process the secondary information strip
is checked to ensure that the proper reticle has been mounted in the
camera.

Upon completion of the step-and-repeat process, another develop- 
ment-and-inspection cycle occurs with the mask being passed to the 
print area. At the proper time, the number and type of prints required 
are displayed to the operator. After the prints have been made, a final 
development-and-inspection cycle occurs and the finished masks are 
available. 


VI. MSIS SYSTEM PROGRAM STRUCTURE 


During the early design stage of MSIS, it became obvious that:


  (i) To keep and continually update the data required to process
      jobs in the shop, such a large number of disk-memory accesses
      would be required that the system performance would be limited
      by the ability to read from disk memory.

 (ii) MSIS would be continually receiving requests for service either
      directly or indirectly from about 12-15 shop operators and would
      have to answer within reasonable human-response times.
      Furthermore, due to the difference in characteristics of the shop
      facilities, some of this communication could be handled via
      speedy interfaces such as data links while other input/output
      would have to be handled at relatively slow teletypewriter
      speeds.

(iii) To allow demand and periodic reports on shop progress and
      periodic checks of data to prevent potential problems, there
      would have to be some programs with long processing times
      included in the system.


It was decided that the best computer system for solving these problems
would be a multi-programming system with a task-priority scheme.
Since such an operating system did not exist for the PDP-9, the
programming of a monitor had to be included in the MSIS project.

With the present MSIS multi-programming monitor, execution of 
one program can go on simultaneously with block transfers of data to 
and from disk for another program. Furthermore, by keeping those 
programs that have long processing times or that use slow input/ 
output devices in separate execution areas from faster running pro- 
grams, it is possible to provide quick operator responses and still run 
lengthy programs. Using a task-priority scheme, those tasks* that
must provide quick response can be given high priority and thus
processed much more quickly than the slower report and data-checking
programs.

With only 8K words of core memory, the luxury of a complex monitor 
that allows dynamic allocation of execution areas and relocatable 
programs cannot be afforded. Thus the core memory is divided into 
fixed execution areas. Each task is assigned to an execution area ac- 
cording to its characteristics. Another restriction used to simplify the 
monitor is that swapping of tasks in the midst of execution is not per- 
mitted. That is, once a task is in execution in an execution area, no 
other task that requires the same area can be executed until the present 
task is completed. 

The layout of the PDP-9 core memory is illustrated in Fig. 6. The 
first 3300 words are taken up by the monitor. Approximately the same 
amount of space is divided into 6 execution areas for task processing. 
The remainder of core memory is used as a ‘‘common” area to provide 
communication of data between the tasks. 


6.1 Monitor 

The monitor comprises three main modules: the task sequencer, the
interrupt handler, and input/output control.

The heart of the monitor is the task sequencer; its basic function is 
to determine the sequence in which the tasks should be executed. The 
relative priority of the various tasks that make up the MSIS is deter- 
mined by their relative order as they appear in the task list. 

Whenever it is necessary to determine which task is to be executed
next, the task sequencer scans the task list from the beginning. It
searches for the first task which is activated, which belongs to an
execution area that is not already in use by another task, and which is
not still waiting for the completion of an I/O transfer. If the chosen
task has already started execution, the task sequencer restores the
registers, and returns to the place where it was interrupted. If the
selected task is ready to start from the beginning and in core, the task
sequencer transfers to its starting location. If the task has yet to be
read in from disk, the sequencer calls I/O control to perform the
disk transfer and continues its search.


*Throughout this paper “task” is used to indicate a collection of computer
subroutines that perform a particular function. These tasks are usually stored on
disk memory and brought into core memory only when needed.


Fig. 6—MSIS PDP-9 core memory layout.
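
The scan performed by the task sequencer can be illustrated with a short Python sketch. It is only an illustration of the selection rules stated above, not the PDP-9 code: the names `Task`, `Action`, `select_next`, `busy_areas`, and `request_disk_load` are hypothetical, and reserving the execution area while a task is being read in from disk is an assumption made here to keep the example self-consistent.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Action(Enum):
    RESUME = auto()  # restore registers and continue an interrupted task
    START = auto()   # task is in core: transfer to its starting location
    IDLE = auto()    # no task can run right now


@dataclass
class Task:
    name: str
    area: str                 # execution area the task is assigned to
    activated: bool = False
    waiting_io: bool = False  # an I/O transfer has not yet completed
    started: bool = False     # execution has begun (possibly interrupted)
    in_core: bool = False     # already read in from disk


def select_next(task_list, busy_areas, request_disk_load):
    """Scan the task list in priority order and pick the task to run."""
    for task in task_list:
        if not task.activated or task.waiting_io:
            continue
        owner = busy_areas.get(task.area)
        if owner is not None and owner is not task:
            continue  # area in use by another task; swapping is not allowed
        if task.started:
            return task, Action.RESUME
        if task.in_core:
            busy_areas[task.area] = task
            task.started = True
            return task, Action.START
        # Not in core yet: reserve the area (an assumption of this sketch),
        # ask I/O control for the disk transfer, and keep searching.
        busy_areas[task.area] = task
        task.waiting_io = True
        request_disk_load(task)
    return None, Action.IDLE
```

Because the list is always scanned from the beginning, the position of a task in the list is its priority; no separate priority field is needed.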

The interrupt handler executes whenever an interrupt occurs as the 
result of the completion of an I/O transfer or an overflow of the real- 
time clock. The interrupt handler immediately saves the registers for 
the program in execution, and then decides what caused the interrupt. 
If the interrupt was caused by a special keyboard, a reduction-camera 
signal, a data link, or a request for attention (carriage-return) from 
one of the teletypewriters, the interrupt handler sets up some common 
data words and activates the task required to handle the input.

If the interrupt was one of a series of interrupts that occur in the
process of completing an I/O transfer (such as the transfer of one
character of a teletypewriter message), the interrupt handler stores
the data and/or sets up the transfer of the next data, and updates
some common words to keep track of the status of the overall message.
If the interrupt marks the end of a transfer (such as the last column
of a card), the interrupt handler sets up the appropriate common words
and activates the I/O control program. When all interrupts have been
handled, control is given to the task sequencer.

The purpose of the I/O control program is to make I/O transfers
via the card reader, magnetic tape, disk, paper tape, and both
teletypewriters appear to the application tasks as fully buffered
operations that can be handled immediately through subroutine calls.
In fact, fully buffered I/O occurs only with the magnetic tape and
disk. When I/O control is called by an application task, it simulates
an interrupt and locks the task from execution so that the task
sequencer will allow other tasks to execute while the data transfer is
taking place. Input/output requests for busy devices are queued and
initiated as soon as the device is free. Five retries are made when
disk or tape parity errors occur. For all other errors, an error is
returned to the application task.
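
The queueing and retry rules of I/O control can be sketched as follows. This is a simplified, synchronous illustration: the real monitor is interrupt-driven, and the names `Device`, `run_device`, and the status strings `"ok"` and `"parity"` are hypothetical. Only the limit of five retries and the restriction of retries to disk and tape come from the text.

```python
from collections import deque

MAX_RETRIES = 5                      # from the text: five retries
RETRY_DEVICES = {"disk", "magtape"}  # parity errors are retried only here


class Device:
    def __init__(self, name, transfer):
        self.name = name
        # Hypothetical callable: takes a request, returns "ok", "parity",
        # or some other error string.
        self.transfer = transfer
        self.pending = deque()       # requests queued while the device is busy


def run_device(device):
    """Serve queued requests in order; return (request, status) pairs."""
    results = []
    while device.pending:
        request = device.pending.popleft()
        status = device.transfer(request)
        retries = 0
        while (status == "parity" and device.name in RETRY_DEVICES
               and retries < MAX_RETRIES):
            retries += 1
            status = device.transfer(request)
        # Any status other than "ok" is handed back to the application task.
        results.append((request, status))
    return results
```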


6.2 Execution Areas 

There are six execution areas for the MSIS application tasks. Their 
size and arrangement are illustrated in Fig. 6. 

Execution area I contains the highest-priority disk tasks. These are 
tasks which lower-priority tasks activate to accomplish activities re- 
quiring an update of shared data blocks that are stored on disk. Since 
all tasks that are allowed to update these shared data blocks at 
crucial times are included in execution area I, they can never be in 
execution simultaneously and thus no updates can be lost due to “race” 
conditions between two tasks. The scheduler task, which handles all 
queue manipulations, plate carrier assignments and job-status updates, 
is located in execution area I. The allocate task, which dynamically 
allocates and restores disk space, is also located in execution area I. 

All low-priority disk tasks use execution area II. These are tasks 
which can afford to wait for their execution area to be free without 
appreciably slowing up the processing of tasks. There are 25 tasks pres- 
ently assigned to this area whose functions include the following: enter- 
ing and deleting jobs from the system, initializing tables at the begin- 
ning of a shift, asking for plate carriers to be moved between facilities, 
listing the contents of a plate carrier upon request, asking for shop 
output to be delivered to the engineer, and reporting on shop progress 
and shop problems. 

Disk tasks which have medium priority use execution area III. A
medium-priority task is one which, in general, does not have another
task waiting for its completion, but which does have an operator
waiting for a response. The tasks which use this execution area include
those which control the task assignments for the PPG, reduction
cameras, step-and-repeat camera, and inspectors.

Execution areas V, VI, and VII contain small, high-priority in-core 
tasks. One of these tasks provides a check against the failure of an 
I/O device to respond, which would cause a tie-up in the system. The 
other two tasks accept requests from the two teletypewriters and de- 
code the messages to decide what disk task should be activated to 
handle the request. 


6.3 Common 

As has already been pointed out, most of the common data area 
is used to pass data between tasks. Another use for this common data 
area is to allow a task to save crucial data from one execution time to 
another without requiring the data to be stored on disk. For instance, 
the scheduling task saves the top of each of its facility queues in core 
so that it can perform most queue manipulations without taking the 
time to access disk. Also, the facility control tasks save the description 
of the mask presently in process in core because the control task is 
usually activated several times before passing one mask through the 
facility. 


VII. MSIS DATA STRUCTURE 


The bulk of the MSIS data is kept on disk in data sets called descrip- 
tion blocks. While a particular description block is part of MSIS, its 
location on disk remains constant so that a “pointer” to its location on 
disk is a unique and unchanging number. These disk pointers are used 
to set up a structure of rings and linked lists that unite the data for one 
job even though the data is not in one contiguous area. In this way, 
disk segments (64 words) can be allocated in a random fashion, avoid- 
ing the problem of collecting a contiguous data area large enough for 
a particular job. The disk pointers are used in tables that correlate 
the data with names that have meaning to the shop operators so that 
teletypewriter requests for specific data can be made. 


7.1 Description Blocks 

There are five kinds of description blocks: job description block
(JDB), mask description block (MDB), pattern description block
(PDB), inspection description block, and step-and-repeat-array
description block. The job, mask, and pattern description blocks have a
fixed length of one segment. The inspection and step-and-repeat-array
description blocks can be any number of linked segments.

When a job is entered in the MSIS, its processing data is split into 
description blocks. As the job is processed, additional data is added to 
these blocks and this data can be retrieved at any time. 

The data common to all masks of a job, such as the engineer’s name 
and the circuit code, are kept in a JDB. The data common to all pat- 
terns of a mask, such as the mask-identification number and the pres- 
ent status of the mask, are kept in an MDB. The data particular to one 
pattern of a mask, such as the pattern-identification number and the 
current location of the glass plate, are kept in a PDB. 

As explained in Section 3.1 of this paper, the engineer may specify 
particular features of a mask that he wants inspected. All the inspec- 
tion information for one mask or pattern is kept in an inspection 
description block. 

A description of the array of patterns to be used in making a mask 
is kept in a step-and-repeat-array description block. The placements of 
each pattern in terms of X and Y coordinates are given in micron 
dimensions. The pattern names may be given in general form accord- 
ing to the order of the PDB. In this way, one step-and-repeat-array 
description block can be used by many masks as mentioned in Section
III of this paper.


7.2 Job-Data Structure 

All the data for one job are linked together by a ring and linked 
list structure. The ring structure is used to unite the job, mask, and 
pattern description blocks as illustrated in Fig. 7. An inspection de- 
scription block consists of a linked list of segments; the pointer to 
the first of these segments is placed in the description block of the mask 
or pattern described by the inspection data. The step-and-repeat-array 
description block is also a linked list of segments; the pointer to the 
first of these segments is placed in all MDBs using this array descrip- 
tion. 
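
A minimal Python sketch of such a pointer structure is given below. The paper does not spell out the exact ring layout, so the arrangement used here is an assumption: each ring hangs below its owner through a `down` pointer and closes back on the owner through the `next` pointer of its last member. The disk is modeled as a dictionary from segment pointer to block, and all names (`Block`, `allocate`, `add_child`, `find_job`, `job_blocks`) are hypothetical. The inspection and step-and-repeat-array linked lists are not modeled.

```python
import random
from dataclasses import dataclass, field

NIL = 0  # assumed "no pointer" value


@dataclass
class Block:
    kind: str                      # "JDB", "MDB", or "PDB"
    data: dict = field(default_factory=dict)
    next: int = NIL                # next block on the same ring (or the owner)
    down: int = NIL                # first block of the ring below this one


def allocate(disk, block, rng):
    """Store a block in a randomly chosen free segment; return its pointer."""
    while True:
        ptr = rng.randrange(1, 16384)  # arbitrary segment count for the sketch
        if ptr not in disk:
            disk[ptr] = block
            return ptr


def add_child(disk, owner_ptr, child, rng):
    """Insert a child at the head of the ring below owner_ptr."""
    owner = disk[owner_ptr]
    child.next = owner.down if owner.down != NIL else owner_ptr
    owner.down = allocate(disk, child, rng)
    return owner.down


def find_job(disk, ptr):
    """Follow ring pointers from any block until the JDB is reached."""
    while disk[ptr].kind != "JDB":
        ptr = disk[ptr].next
    return ptr


def ring(disk, owner_ptr):
    """Yield the pointers of the blocks on the ring below owner_ptr."""
    ptr = disk[owner_ptr].down
    while ptr not in (NIL, owner_ptr):
        yield ptr
        ptr = disk[ptr].next


def job_blocks(disk, ptr):
    """Return every JDB, MDB, and PDB pointer of the job containing ptr."""
    jdb = find_job(disk, ptr)
    found = [jdb]
    for mdb in ring(disk, jdb):
        found.append(mdb)
        found.extend(ring(disk, mdb))
    return found
```

Starting from any one pointer, `job_blocks` reaches the whole job even though the segments were allocated at random positions, which is the property the ring structure is meant to provide.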

With this data structure, data for one job may be scattered in
random fashion over the disk and yet one pointer to any one of these
segments can lead to all the data for the job. Using this arrangement,
data can be easily added to or deleted from a job description.
Furthermore, no data structure rules cause limitations to be placed on
the number of masks in a job, the number of patterns in a mask, the
number of critical features to be inspected, or the complexity of an
array description.


[Figure: one job description block at the top; below it a row of mask
description blocks; below those a row of pattern description blocks.]

Fig. 7—Description block ring structure.


7.3 System-Data Structure 


MSIS uses a table structure to connect the external world with its 
data. MSIS can retrieve all information about a particular job (if the 
request for the data includes the job number) by referencing a table 
which correlates the job number with the JDB pointer. Furthermore, 
the status and location of any glass plate in the shop, whether it be 
a piece of artwork, a reticle, a master mask, or a working copy can 
be obtained through a table which correlates plate identification num- 
bers with the appropriate description block. 

Also, the facility queues need contain only a one-word description 
block pointer for each entry on the queue because all the data needed 
when an operator or a facility requests a new assignment can be ob- 
tained with the use of that pointer. 

Shop statistics are kept in status tables on permanent disk seg- 
ments. The figures in these tables are continually updated by the appli- 
cation tasks. Thus when system statistics, such as the number of jobs 
in process, or the average time to get a job through the shop, are 
requested by a demand report, the answer is immediately available. 


VIII. SIMULATION 


Concurrent with the design of MSIS, a program which simulates the
flow of jobs through the new mask-making facility was written using
the IBM General Purpose System Simulator. The predictions made
with this simulator were used to set upper limits on table lengths, to
design the scheduling algorithm, and to design the MSIS-inspector
interface.

With a fairly good knowledge of the processing times at each facility 
and the number and type of jobs that would pass through the shop on 
an average day, it was possible to set up a reasonably accurate simula- 
tion of the shop. However, two items were particularly hard to de- 
scribe: the length of time required to inspect a plate and the rejection 
rate for inspected plates. Based on an earlier generation mask shop 
which existed at Allentown, some figures were obtained. However, two 
of the main purposes of the new shop were to provide more reliable 
making of plates and a better inspection facility. Thus these figures 
were considered worst-case values. An educated guess was used to 
establish a more likely set of numbers. Using both sets of inputs and 
varying the load on the shop, a large number of simulation runs were 
made. The results of the simulation for a load of 25 masks/shift are 
summarized in Table II. 


8.1 Table Lengths 

With a million-word disk, it was possible to be safe and allow extra 
space for table lengths. Nevertheless, some numbers were needed to 
decide just what “safe” meant. Here the simulation was invaluable. 
By having numbers for the maximum number of jobs in process and 
the total number of plate carriers used, it was possible to define an 
upper limit for the plate and carrier tables. 


8.2 Scheduling Algorithm 

The scheduling algorithm maintains a queue of masks that are await- 
ing processing by each of the facilities that make up the mask shop 
(e.g., PPG, reduction camera, etc.). One problem in designing the sched- 
uling algorithm was the setting of a limit on the size of a facility 
queue. If a facility queue was allowed to be indefinitely long, the 
scheduling program would be cumbersome. Even if the queue lengths 
were set at a large number, the queues would have to be located on 
disk and the number of disk accesses involved in the continual queue 
manipulations would be too time-consuming. However, the possibility 
of a facility breakdown and a queue build-up prevented the placing of 
a tight restriction on queue length. 

Thus it was decided to have a set of in-core facility queues that
would be large enough for use during normal shop operation and queue
extensions on disk to be used during abnormal operation. Using the
results of the simulation, the number of in-core queue entries was
established as ten; with this number, most queue manipulations can
take place without disk accesses.


TABLE II—INSPECTION RESULTS

                                     Long Inspection     Short Inspection
                                     Times, High         Times, Low
                                     Rejection Rates     Rejection Rates

Shop through-put                     22 masks/shift      24 masks/shift
Turn-around time (normal)            3 days              1½ days
Turn-around time (priority)          1½ days             1 day
Average length of a queue            8                   5
Maximum number of jobs in process    56                  40
Total number of carriers used        48                  32
Average percent of time 8
  inspectors were busy               85%                 35%

Other decisions that had to be made in designing the scheduling 
algorithm were whether a first-come, first-served system would be ade- 
quate, whether priority jobs should be allowed in shop operation, and, 
if so, what the number of priority levels should be. The simulation was 
used to test the possibilities. It was discovered that if the shop is keep- 
ing up with the input load (this occurs at a load of 25 masks/shift), 
most jobs can be completed in a couple of days. Also, if two levels of
priority are used for jobs entering the shop and the number of high- 
priority jobs is limited to 5 percent, a high-priority job can be com- 
pleted in one day. Thus a simple two-level priority scheme is consid- 
ered adequate, at least for a first version of MSIS. 
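
The combination of a ten-entry in-core queue, a disk extension, and two priority levels can be sketched in Python as below. The class `FacilityQueue`, its method names, and the `disk_accesses` counter are hypothetical, and the rule for spilling entries to the disk extension is an assumption; the text states only the in-core size and the two-level priority scheme. Each entry is a description block pointer, as described in Section 7.3.

```python
import bisect

IN_CORE_ENTRIES = 10  # from the text


class FacilityQueue:
    """Priority-ordered queue whose first ten entries stay in core."""

    def __init__(self):
        self.core = []          # sorted (priority, sequence, pointer) tuples
        self.disk = []          # stand-in for the disk extension
        self.disk_accesses = 0
        self._seq = 0           # preserves first-come order within a level

    def put(self, pointer, high_priority=False):
        entry = (0 if high_priority else 1, self._seq, pointer)
        self._seq += 1
        if len(self.core) < IN_CORE_ENTRIES and not self.disk:
            bisect.insort(self.core, entry)
        elif entry < self.core[-1]:
            # Belongs in core: the last in-core entry spills to disk.
            bisect.insort(self.core, entry)
            bisect.insort(self.disk, self.core.pop())
            self.disk_accesses += 1
        else:
            bisect.insort(self.disk, entry)
            self.disk_accesses += 1

    def get(self):
        """Remove and return the pointer at the top of the queue."""
        entry = self.core.pop(0)
        if self.disk:
            # Refill core from the head of the disk extension.
            self.core.append(self.disk.pop(0))
            self.disk_accesses += 1
        return entry[2]
```

As long as a queue holds ten entries or fewer, `disk_accesses` stays at zero, which mirrors the claim that most queue manipulations can take place without disk accesses.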


8.3 Inspection-Station Design 


For most of the shop facilities, the hardware defines the number of 
jobs that can be in process at one facility at one time and thus no 
facility limitation problems were encountered in the design of MSIS. 
However, the inspection facility is limited only by the number of in- 
spectors and the capacity of the communication device. Considering the 
length of time required to complete one inspection process, it was real- 
ized that one teletypewriter could readily service five to ten inspectors. 
And, if necessary, another teletypewriter could easily be added to the 
MSIS hardware. 

The number of inspectors allowed was critical to the design of the
inspection control task and the allocation of core area for description
blocks for masks in process at the inspection facility. Thus the
simulation was invaluable again. The results showed that the number of
inspectors could be limited to seven without impairing shop efficiency.


IX. MSIS FUTURE 


At present, MSIS is controlling the making of masks at Murray Hill 
in a shop that is using one PPG and two reduction cameras to handle 
a small load of work. When the contact printing facility becomes avail- 
able at Murray Hill, MSIS will also assign tasks in this area. The 
information system will then be installed in the Allentown shop; at this 
time, the communication with the step-and-repeat camera will be in- 
cluded. Before the end of the year, MSIS will be controlling the opera- 
tion of both the Murray Hill and Allentown shops. 

It is too early to know what problems will be encountered during a 
long period of shop operation with MSIS. Realistically, it must be 
assumed that even with the simulation, some unexpected demands on 
the system will turn up and some additions and changes will be re- 
quired. However, the monitor is sufficiently general that it is doubtful 
that it will undergo any major change. And the method of communica- 
tion between tasks, the description block data arrangement, the hand- 
ling of queues, the plate-carrier assignments, and the allocation of disk 
areas are basic enough to remain permanent. Since the coding for these 
functions has been kept in separate tasks from those dealing directly 
with the outside world, these tasks can be kept intact even when major 
shop changes take place. Thus a change in shop operation will probably 
lead to the rewriting of a task or two, and it will be possible to add the 
new tasks without causing havoc to the rest of the system. 

This modular arrangement of system functions will also work out 
well when MSIS expands. A study is now underway to see how the 
adding of a data link between the coordinate measuring machine in 
the inspection area and the MSIS computer will improve shop opera- 
tion. It is believed that with a new inspection task added to the MSIS 
task list and a small addition to the interrupt handler, this improved 
inspection capability can readily be added to MSIS.


Response of Periodically Varying Systems to
Shot Noise—Application to Switched
RC Circuits

By S. O. RICE 


(Manuscript received June 16, 1970) 


This paper is concerned with the statistical properties of the output
y(t) of a periodically varying linear system when the input is random
shot noise.

Usually y(t) can be divided into a noise part, $y_N(t)$, and a periodic
part, $y_{per}(t)$. Expressions are obtained for the Fourier components
of $y_{per}(t)$ and the power spectrum of $y_N(t)$. Various averages
associated with y(t) are studied. Some of the results for shot noise
input can be converted into corresponding results for white noise input.

Some of the theoretical results are illustrated by applying them to two 
examples. In both examples the system consists of an arrangement of a 
resistance, a condenser, and a switch which opens and closes periodically. 
The output is the voltage across the condenser. 


I. INTRODUCTION 


Consider a circuit, shown in Fig. 1a, consisting of a resistance R
shunted by a switch and condenser C. The circuit is driven by a
Poisson shot noise current. The elementary charges q arrive at random
at an average rate of $\nu$ per second. The switch operates in a cycle
with period T. It is closed during the intervals
$nT < t < nT + \alpha T$ and open during the intervals
$nT + \alpha T < t < (n + 1)T$ where n is an integer and
$0 < \alpha < 1$. We are interested in the statistical properties
of the voltage V(t) across the condenser. In particular, we want an
expression for the two-sided power spectrum $W_V(f)$ of V(t).
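
The switching schedule just described is easy to state in code. The short Python sketch below is an illustration only; the function name `switch_closed` and its parameter names are hypothetical. It returns whether the switch is closed at time t, that is, whether t lies in one of the intervals from nT to nT + αT.

```python
import math


def switch_closed(t, period, alpha):
    """True when nT < t < nT + alpha*T for some integer n."""
    phase = t - math.floor(t / period) * period  # position within the cycle
    return 0.0 < phase < alpha * period
```

Sampling this function uniformly over many cycles gives a closed fraction close to α, which is the meaning of α as the fraction of time the switch is closed.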

This problem was encountered by D. D. Sell¹ during the development
of a new type of spectrophotometer. The determination of an
exact expression for $W_V(f)$ turned out to be unexpectedly difficult,
and led to the present investigation of the more general case in which
the switched RC circuit is replaced by a general linear network which
varies periodically with time.


THE BELL SYSTEM TECHNICAL JOURNAL, NOVEMBER 1970


[Figure: panels (a) and (b), each showing an infinite-impedance shot
noise current generator driving a switched RC circuit.]

Fig. 1—RC circuits with periodically operating switch.

The systems shown in Fig. 1 are “cyclo-stationary” (this term was
introduced by W. R. Bennett). Cyclo-stationary systems have been
studied by a number of writers. A detailed treatment and many
references are given by H. L. Hurd² in his thesis on periodically
correlated stochastic processes. However, I have been unable to find
any references dealing specifically with periodically varying systems
having shot noise input. The nearest approach is contained in seven
pages of anonymous handwritten notes³ obtained by Sell. These notes
give approximate results for the case of Fig. 1a with white-noise input
instead of shot noise input.

In Section II, we make some general remarks about the notation
and type of analysis used in this paper. Section III contains a
statement of results for the general system shown in Fig. 2. In
Sections IV and V, the general results are applied to the RC circuits
shown in Figs. 1a and 1b. Representative curves giving $W_V(f)$ for
various values of the circuit parameters in Fig. 1a are plotted in
Fig. 3. Sections VI, VII, and VIII contain the derivation of the
expressions stated in Section III for the various ensemble averages and
the output power spectrum. The results for shot noise input can be
carried over into corresponding results for white gaussian noise input.
This correspondence is developed in Section IX. Appendix A gives an
outline of the analysis required in applying the general theory to get
the power spectrum $W_V(f)$ of the output V(t) in the RC circuit of
Fig. 1a.

Roughly speaking, the shot-effect formulas for a periodically varying
system differ from the shot-effect formulas for a time invariant
system⁴ by containing an additional integration. This extra integral
represents an average taken over the period.


II. REMARKS CONCERNING NOTATION AND ANALYSIS


In this paper ensemble averages are denoted by the angle bracket
$\langle\ \rangle$ and time averages by over-bars. For example, consider
V(t) in Fig. 1. We can write $V(t) = V(t, \zeta)$ where $\zeta$
represents the family of random arrival times of the charges q
comprising the shot noise current. When t is held fixed, V(t) can be
regarded as a random variable and $\langle V^l(t) \rangle$ as the
average value of the lth power of V(t) at time t. On the other hand,
for a fixed set $\zeta$ of arrival times, i.e., for a particular member
of the ensemble, the time average of $V^l(t)$ is denoted by

$$\overline{V^l(t)} = \lim_{T_1 \to \infty} \frac{1}{T_1} \int_0^{T_1} V^l(t) \, dt. \tag{1}$$

Let z(t) be an output function (e.g., $V^l(t)$) of our periodic system
such that its ensemble average $\langle z(t) \rangle$ is periodic with
period T, the period of the system. We assume that the time average
$\overline{z(t)}$ has the same value for almost all members of the
ensemble. From this assumption and the periodicity of
$\langle z(t) \rangle$ it follows, upon averaging both sides of the
equation

$$\overline{z(t)} = \lim_{T_1 \to \infty} \frac{1}{T_1} \int_0^{T_1} z(t) \, dt$$

over the ensemble, that

$$\overline{z(t)} = \frac{1}{T} \int_0^T \langle z(t) \rangle \, dt. \tag{2}$$

In addition to ensemble and time averages, we shall use E to denote
expected values of time invariant random variables associated with
the amplitudes of the shot noise impulses.







[Figure: a box labeled LINEAR SYSTEM, $h(t, \tau)$.]

Fig. 2—Time-varying linear system specified by $h(t, \tau)$.


Fig. 3—Power spectrum of V(t) in Fig. 1a.

$W_V(f)$ = 2-sided power spectrum of V(t) minus dc spike due to $V_{dc} = \nu q R$.
$I(t) = \sum_k q\,\delta(t - t_k)$, $\nu$ = arrival rate, $\gamma = 1/(RC)$;
$\alpha$ = fraction of time switch is closed;
T = length of switch cycle;
$(\alpha, \gamma T)$ = curve parameters.


We use the term “periodic” to mean “singly periodic.” The more
difficult case of “multiply periodic” variation is not considered. An
example of the latter is given by the circuit of Fig. 1a in which the
switch is operated by the function f(t) = P cos pt + Q cos qt, p and
q being incommensurable. The switch is closed when f(t) > 0, and is
open when f(t) < 0. Possibly such cases could be handled by the
method used by Bennett⁵ to obtain the output of a rectifier when
P cos pt + Q cos qt is applied.

The (two-sided) power spectrum $W_V(f)$ [where $W_V(-f) = W_V(f)$]
can be interpreted physically as follows. Let V(t) be applied to an
ideal filter which passes only the narrow band
$f_1 < |f| < f_1 + \Delta f$, and let the filter be terminated in a
resistance of one ohm. Then

$$[W_V(-f_1) + W_V(f_1)]\,\Delta f = 2 W_V(f_1)\,\Delta f$$

is the time average of the power which would be dissipated in the
one ohm resistance. The average must be taken over an interval long
in comparison with $1/\Delta f$.

The analysis used here makes no attempt at mathematical rigor. 
Orders of summation and integration are interchanged freely, and 
assumptions are made which are physically plausible but which may 
be difficult to express in precise mathematical terms. 


III. STATEMENT OF RESULTS FOR GENERAL SYSTEM 


The results given in this section pertain to the general system shown 
in Fig. 2. The system is linear and is specified by the response y(t) 
= h(t, r) to a unit impulse x(t) = 8(t — +r) applied at time r. The 
system varies periodically with period T so that 


hé+nT,7r+ nT) = hit, 7) n = integer. (3) 


In most of our work, the input $x(t)$ is the shot noise

$$x(t) = \sum_{k=-\infty}^{\infty} a_k\,\delta(t - t_k) \tag{4}$$

where the random “arrival times” $t_k$ occur at an average rate of
$\nu$ per second and constitute a Poisson process. The impulse
amplitudes $a_k$ are independent random variables with

$$E(a_k) = E(a), \qquad E(a_k^2) = E(a^2). \tag{5}$$

Since the system is linear, the output corresponding to equation (4) is

$$y(t) = \sum_{k=-\infty}^{\infty} a_k\,h(t, t_k). \tag{6}$$


The function $h(t, \tau)$ is assumed to be such that the steps in the
analysis are legitimate. In particular, it is assumed that when
$0 < \tau < T$ and $|t| \to \infty$, $|h(t, \tau)|$ tends to 0 with
sufficient rapidity to (1) make the various integrals converge, and
(2) ensure that the times at which a long interval of operation begins
and stops have no appreciable influence on the output during the major
portion of the interval.

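In Section VIII it is shown that $y(t)$ is the sum of a noise
component $y_N(t)$ and a periodic (including dc) component
$y_{\mathrm{per}}(t)$:

$$y(t) = y_N(t) + y_{\mathrm{per}}(t). \tag{7}$$

The power spectrum of $y_N(t)$ is

$$W_{y_N}(f) = \frac{\nu E(a^2)}{T}\int_0^T |s(f, \tau)|^2\,d\tau \tag{8}$$

where

$$s(f, \tau) = \int_{-\infty}^{\infty} e^{-i\omega t}\,h(t, \tau)\,dt, \qquad \omega = 2\pi f. \tag{9}$$

The periodic component of $y(t)$ is

$$y_{\mathrm{per}}(t) = \nu E(a)\sum_{m=-\infty}^{\infty} s_0(m/T)\,e^{i2\pi mt/T}
= \nu E(a)\,s_0(0) + 2\nu E(a)\,\mathrm{Real}\sum_{m=1}^{\infty} s_0(m/T)\,e^{i2\pi mt/T}, \tag{10}$$

where

$$s_0(f) = \frac{1}{T}\int_0^T s(f, \tau)\,d\tau. \tag{11}$$

The dc part of $y(t)$ is given by the constant term in equation (10):

$$y_{dc} = \nu E(a)\,s_0(0). \tag{12}$$

Note that $y_{\mathrm{per}}(t)$ is zero when $E(a)$ is zero.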

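The ensemble average $\langle y^l(t)\rangle$, which gives the $l$th moment
of the distribution of the ensemble of $y(t)$'s at time $t$, is a periodic
function of $t$ of period $T$. For $l = 1$ and $l = 2$

$$\langle y(t)\rangle = \nu E(a)\sum_{n=-\infty}^{\infty}\int_0^T h(t + nT, \tau)\,d\tau, \tag{13}$$

$$\langle y^2(t)\rangle - \langle y(t)\rangle^2 = \nu E(a^2)\sum_{n=-\infty}^{\infty}\int_0^T h^2(t + nT, \tau)\,d\tau. \tag{14}$$

These equations give the first and second cumulants of the distribution
of the $y(t)$'s. The $l$th cumulant at time $t$ is

$$\kappa_l(t) = \nu E(a^l)\sum_{n=-\infty}^{\infty}\int_0^T h^l(t + nT, \tau)\,d\tau. \tag{15}$$

The periodic and noise components of $y(t)$ are related to the ensemble
averages by

$$y_{\mathrm{per}}(t) = \langle y(t)\rangle = \kappa_1(t), \tag{16}$$

$$\langle y_N^2(t)\rangle = \langle y^2(t)\rangle - \langle y(t)\rangle^2 = \kappa_2(t). \tag{17}$$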

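The mean square value of $y_N(t)$, averaged over time, may be expressed
in several ways:

$$\begin{aligned}
\overline{y_N^2(t)} &= \frac{1}{T}\int_0^T \langle y_N^2(t)\rangle\,dt = \frac{1}{T}\int_0^T \kappa_2(t)\,dt, \\
&= \int_{-\infty}^{\infty} W_{y_N}(f)\,df, \\
&= \frac{\nu E(a^2)}{T}\int_0^T d\tau\int_{-\infty}^{\infty} df\,|s(f, \tau)|^2, \\
&= \frac{\nu E(a^2)}{T}\int_0^T d\tau\int_{-\infty}^{\infty} dt\,h^2(t, \tau).
\end{aligned} \tag{18}$$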


All of the foregoing results pertain to the case in which the input
$x(t)$ is the shot noise (4). Now let the input be zero-mean white
gaussian noise with the power spectrum

$$W_x(f) = \begin{cases} N_0, & |f| < F; \\ 0, & |f| > F; \end{cases} \tag{19}$$

where $F \to \infty$. It is shown in Section IX that results for this
input can be obtained from the preceding shot noise formulas by taking
$a_k = \pm(N_0/\nu)^{1/2}$ with equal probability and letting
$\nu \to \infty$. Then

$$\nu E(a) \to 0, \quad \nu E(a^2) \to N_0, \quad\text{and}\quad \nu E(a^l) \to 0 \ \text{for}\ l > 2. \tag{20}$$

Therefore $y_{\mathrm{per}}(t) = \langle y(t)\rangle = 0$, and consequently
$y(t)$ consists entirely of the noise component $y_N(t)$. Expressions for
the output power spectrum $W_y(f)$ and the mean square values
$\langle y^2(t)\rangle$, $\overline{y^2(t)}$ are obtained by replacing
$\nu E(a^2)$ by $N_0$ in equations (8), (14), and (18):

$$\begin{aligned}
W_y(f) &= \frac{N_0}{T}\int_0^T |s(f, \tau)|^2\,d\tau, \\
\langle y^2(t)\rangle &= N_0\sum_{n=-\infty}^{\infty}\int_0^T h^2(t + nT, \tau)\,d\tau, \\
\overline{y^2(t)} &= \frac{N_0}{T}\int_0^T d\tau\int_{-\infty}^{\infty} df\,|s(f, \tau)|^2
= \frac{N_0}{T}\int_0^T d\tau\int_{-\infty}^{\infty} dt\,h^2(t, \tau).
\end{aligned} \tag{21}$$

In these expressions, $s(f, \tau)$ is still the Fourier transform (9) of
$h(t, \tau)$. Equations (15) and (20) show that all of the cumulants
except the second are zero. Therefore the ensemble of $y(t)$'s at time
$t$ is normally distributed about 0 with variance $\langle y^2(t)\rangle$
given by equation (21). The probability that $y(t)$ will lie between $Y$
and $Y + dY$ at a time $t$ picked at random is given by expression (113)
in Section IX.


IV. RC CIRCUIT OF FIGURE 1a


In this section the results stated in Section III for general systems
will be applied to the RC circuit shown in Fig. 1a. In this case the
input $x(t)$ is the input $I(t)$ from the shot-noise current generator,

$$I(t) = \sum_{k=-\infty}^{\infty} q\,\delta(t - t_k) \tag{22}$$

where the individual charges (of $q$ coulombs) arrive at an average rate
of $\nu$ per second.

Comparison with the series (4) for $x(t)$ shows that $a_k = q$ and

$$E(a) = q, \qquad E(a^2) = q^2. \tag{23}$$

The output $V(t)$ is constant for intervals of length $(1 - \alpha)T$ while
the switch is open. When the switch is closed $V(t)$ drifts either up or
down, depending upon whether the input current is temporarily greater
than or less than the leakage through $R$. The average value of $V(t)$
is $V_{dc} = \nu q R$ where $\nu q$ is the average current flowing through
$R$. It turns out that the mean square value of $V(t) - V_{dc}$ is
$qV_{dc}/(2C)$. Furthermore, the circuit of Fig. 1a is unusual in that the
distribution of the ensemble of $V(t)$'s at time $t$ does not vary
periodically with $t$.

Some insight into the behavior of the system can be obtained by
considering the case when $T/RC \ll 1$. If the switch were closed all of
the time ($\alpha = 1$), the usual shot effect formulas would hold and the
two-sided power spectrum of $V(t)$ would be

$$\begin{aligned}
W_V(f) &= W_{V_N}(f) + V_{dc}^2\,\delta(f) \\
&= \nu\left|\int_{-\infty}^{\infty} e^{-i\omega t}F(t)\,dt\right|^2 + V_{dc}^2\,\delta(f) \\
&= \frac{\nu q^2R^2}{1 + (\omega RC)^2} + V_{dc}^2\,\delta(f), \qquad \omega = 2\pi f,
\end{aligned}$$

where $F(t)$ is the $V(t)$ due to a charge $q$ arriving at time 0;
$F(t) = (q/C)\exp(-t/RC)$ for $t > 0$, and $F(t) = 0$ for $t < 0$. The first
term in $W_V(f)$ is $W_{V_N}(f)$, the power spectrum of the noise component
$V_N(t) = V(t) - V_{dc}$, and the second term is the spike due to $V_{dc}$.
Now, instead of $\alpha = 1$ let $\alpha$ be anywhere in $0 < \alpha < 1$,
but take $T/RC \ll 1$. The cycles are so brief that $V(t)$ does not change
much during one cycle; and the situation is much like that for
$\alpha = 1$ except that, in effect, $\nu$ is reduced to $\nu\alpha$, and
$F(t)$ becomes $(q/C)\exp(-t\alpha/RC)$ because the condenser current
flows only the fraction $\alpha$ of the time. Replacing $\nu$ by
$\nu\alpha$ and $F(t)$ by its new expression leads to

$$W_{V_N}(f) \approx \frac{\nu q^2R^2/\alpha}{1 + (\omega RC/\alpha)^2}.$$

When $T/RC$ is not small, the expressions for the power spectrum
become much more complicated. We now turn to the general case in
which $T/RC$ and $\alpha$ are unrestricted except for $0 < \alpha < 1$.

The first step is to determine the response (the condenser voltage)
$h(t, \tau)$ at time $t$ to a unit impulse of current arriving at time
$\tau$ where $0 < \tau < T$. When $\alpha T < \tau < T$, the impulse arrives
when the switch is open, no charge reaches the condenser, no voltage
appears across the condenser, and hence

$$h(t, \tau) = 0 \quad\text{for all } t \text{ when } \alpha T < \tau < T. \tag{24}$$

When $0 < \tau < \alpha T$ the switch is closed, and the unit impulse of
current arriving at time $\tau$ deposits a unit charge on the condenser.
This charges the condenser to the voltage $1/C$. The voltage decreases
exponentially as the charge leaks off through $R$ until the switch opens
at time $\alpha T$. The voltage remains constant throughout the interval
$\alpha T < t < T$ during which the switch is open. It resumes its
exponential decay during $T < t < T + \alpha T$, remains constant during
$T + \alpha T < t < 2T$, and so on. Hence when $0 < \tau < \alpha T$ the
values of $h(t, \tau)$ are

$$h(t, \tau) = \begin{cases}
0, & -\infty < t < \tau; \\
C^{-1}\exp[-\gamma(t - \tau)], & \tau < t < \alpha T; \\
C^{-1}\exp[-\gamma(n\alpha T - \tau)], & (n - 1)T + \alpha T < t < nT; \\
C^{-1}\exp[-\gamma(n\alpha T - \tau) - \gamma(t - nT)], & nT < t < nT + \alpha T;
\end{cases} \tag{25}$$

where $n = 1, 2, 3, \cdots$ and

$$\gamma = 1/(RC). \tag{26}$$
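The piecewise response in equations (24) and (25) can be sketched compactly by noting that the exponent is simply $\gamma$ times the total time the switch has been closed between $\tau$ and $t$. The function name `h_fig1a` and its argument order are illustrative choices, not part of the original paper; the switch is assumed closed during $nT < t < nT + \alpha T$.

```python
import math


def h_fig1a(t, tau, alpha, T, R, C):
    """Condenser voltage at time t due to a unit current impulse at time tau.

    Follows equations (24) and (25); tau is assumed to lie in 0 <= tau < T.
    """
    gamma = 1.0 / (R * C)              # equation (26)
    if tau >= alpha * T or t < tau:
        return 0.0                     # switch open at arrival, or t precedes the impulse
    n = math.floor(t / T)              # index of the switch cycle containing t
    into_cycle = t - n * T             # time elapsed since that cycle began
    # Total time the switch has been closed between tau and t.
    closed = n * alpha * T + min(into_cycle, alpha * T) - tau
    return math.exp(-gamma * closed) / C
```

For example, with `R = C = T = 1` and `alpha = 0.5`, an impulse at `tau = 0.1` gives `exp(-0.4)` anywhere in the first open interval, which is the third line of (25) with $n = 1$.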
Equation (6) for the output $y(t)$ becomes

$$V(t) = \sum_{k=-\infty}^{\infty} q\,h(t, t_k) \tag{27}$$

where $h(t, t_k)$ can be obtained from the relation
$h(t + nT, \tau + nT) = h(t, \tau)$ and the values (24) and (25). From
equation (9), the Fourier transform $s(f, \tau)$ of $h(t, \tau)$ is 0 when
$\alpha T < \tau < T$ because, from (24), $h(t, \tau)$ is 0 in the same
interval:

$$s(f, \tau) = 0, \qquad \alpha T < \tau < T. \tag{28}$$
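For $0 < \tau < \alpha T$ we have, from equations (9) and (25),

$$\begin{aligned}
s(f, \tau) &= \int_{-\infty}^{\infty} h(t, \tau)\,e^{-i\omega t}\,dt \\
&= \int_{\tau}^{\alpha T} C^{-1}\exp[-\gamma(t - \tau) - i\omega t]\,dt
+ \sum_{n=1}^{\infty} C^{-1}\exp[-\gamma(n\alpha T - \tau)] \\
&\qquad\times\left(\int_{(n-1)T+\alpha T}^{nT} e^{-i\omega t}\,dt
+ \int_{nT}^{nT+\alpha T} e^{-\gamma(t - nT) - i\omega t}\,dt\right).
\end{aligned} \tag{29}$$

When the integrations are performed, the series summed, and the
notation

$$z = e^{-i\omega T}, \qquad b = e^{-\gamma\alpha T} \tag{30}$$

introduced, some algebra carries equation (29) into

$$s(f, \tau) = \frac{C^{-1}e^{-i\omega\tau}}{\gamma + i\omega}
+ \frac{C^{-1}b\,e^{\gamma\tau}(z^{\alpha} - z)}{1 - bz}
\left(\frac{1}{i\omega} - \frac{1}{\gamma + i\omega}\right) \tag{31}$$

for $0 < \tau < \alpha T$.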
The integral (11) for $s_0(f)$ becomes

$$s_0(f) = \frac{1}{T}\int_0^T s(f, \tau)\,d\tau = \frac{1}{T}\int_0^{\alpha T} s(f, \tau)\,d\tau. \tag{32}$$

The function $s_0(f)$ is used solely to compute the periodic portion of
the output, and therefore only the values of $s_0(m/T)$, where $m$ is an
integer, are of interest. For $f = m/T$ the value of $\omega = 2\pi f$ is
$\omega = 2\pi m/T$, and $\omega T = 2\pi m$. Evaluation of the integral
(32) for $s_0(f)$ leads to

$$s_0(0) = 1/(C\gamma) = R, \qquad s_0(m/T) = 0, \quad m \neq 0. \tag{33}$$

As in equation (7), the output $V(t)$ can be expressed as the sum of
a noise component and a periodic component,

$$V(t) = V_N(t) + V_{\mathrm{per}}(t). \tag{34}$$

Since $s_0(m/T)$ is zero for $m \neq 0$, equation (10) shows that for
Fig. 1a, the periodic component consists only of the dc component:

$$V_{\mathrm{per}}(t) = V_{dc} = \nu E(a)\,s_0(0) = \nu q R. \tag{35}$$

The quantity $\nu q$ is the average shot noise current (in amperes if $q$
is measured in coulombs) flowing through $R$; and $V_{dc}$ is the average
$IR$ drop across the resistance.
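The values of $s_0(m/T)$ quoted above can be checked numerically. The sketch below assumes the closed form $s(f,\tau) = C^{-1}e^{-i\omega\tau}/(\gamma+i\omega) + C^{-1}b\,e^{\gamma\tau}(z^{\alpha}-z)(1-bz)^{-1}[1/(i\omega) - 1/(\gamma+i\omega)]$ for $0 < \tau < \alpha T$, with $z = e^{-i\omega T}$ and $b = e^{-\gamma\alpha T}$, and integrates it over $\tau$ with a midpoint rule. The function names, the midpoint rule, and the number of cells are illustrative assumptions; the closed form is singular at exactly $f = 0$, so a very small nonzero frequency stands in for the dc value.

```python
import cmath
import math


def s_fig1a(f, tau, alpha, T, R, C):
    """Fourier transform s(f, tau) of h(t, tau) for the Fig. 1a circuit (f != 0)."""
    if tau >= alpha * T:
        return 0j                                   # impulse arrives while the switch is open
    gamma = 1.0 / (R * C)
    omega = 2.0 * math.pi * f
    z = cmath.exp(-1j * omega * T)
    z_alpha = cmath.exp(-1j * omega * alpha * T)
    b = math.exp(-gamma * alpha * T)
    first = cmath.exp(-1j * omega * tau) / (C * (gamma + 1j * omega))
    second = (b * math.exp(gamma * tau) * (z_alpha - z) / (C * (1.0 - b * z))
              * (1.0 / (1j * omega) - 1.0 / (gamma + 1j * omega)))
    return first + second


def s0_fig1a(f, alpha, T, R, C, cells=2000):
    """Midpoint-rule estimate of (1/T) * integral of s(f, tau) over 0 < tau < alpha*T."""
    d_tau = alpha * T / cells
    total = sum(s_fig1a(f, (j + 0.5) * d_tau, alpha, T, R, C) for j in range(cells))
    return total * d_tau / T
```

With `R = 2`, `C = 0.5`, `T = 1`, and `alpha = 0.5`, `s0_fig1a(1e-6, ...)` comes out close to `R`, while `s0_fig1a(1.0, ...)` and `s0_fig1a(3.0, ...)` are zero to within the integration error.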

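The value $\nu q R$ for $V_{\mathrm{per}}(t)$ can also be obtained from
equations (16) and (13),

$$\begin{aligned}
V_{\mathrm{per}}(t) &= \langle V(t)\rangle = \kappa_1(t), \\
&= \nu E(a)\sum_{n=-\infty}^{\infty}\int_0^T h(t + nT, \tau)\,d\tau, \\
&= \nu E(a)\,C^{-1}/\gamma = \nu q\,C^{-1}/\gamma = \nu q R,
\end{aligned} \tag{36}$$

where the expressions (24) and (25) for $h(t, \tau)$ are used in summing
the series and evaluating the integral.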

The values of the higher order cumulants $\kappa_l(t)$ follow almost
immediately from equation (36). First observe that the expression (15)
for $\kappa_l(t)$ can be obtained from the expression (13) for
$\langle y(t)\rangle$ $(= \kappa_1(t))$ by replacing $E(a)$,
$h(t + nT, \tau)$ by $E(a^l)$, $h^l(t + nT, \tau)$, respectively.
Furthermore, $h^l(t + nT, \tau)$ can be obtained from $h(t + nT, \tau)$ by
replacing $C^{-1}$ and $\gamma$ by $C^{-l}$ and $l\gamma$, respectively.
Therefore from equation (36),

$$\kappa_l(t) = \nu E(a^l)\,C^{-l}/(l\gamma) = \nu q^l C^{-l}/(l\gamma). \tag{37}$$
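The claim that the sum over cycles of $\int_0^T h^l(t + nT, \tau)\,d\tau$ equals $C^{-l}/(l\gamma)$ independently of $t$ can be checked by brute force. The sketch below is a numerical illustration only: the helper names, the midpoint rule, and the truncation of the sum after a fixed number of cycles are assumptions made here, and `h_fig1a` restates the piecewise response of equations (24) and (25).

```python
import math


def h_fig1a(t, tau, alpha, T, R, C):
    """Response of the Fig. 1a circuit, equations (24) and (25), for 0 <= tau < T."""
    gamma = 1.0 / (R * C)
    if tau >= alpha * T or t < tau:
        return 0.0
    n = math.floor(t / T)
    closed = n * alpha * T + min(t - n * T, alpha * T) - tau   # closed-switch time from tau to t
    return math.exp(-gamma * closed) / C


def cumulant_sum(l, t, alpha, T, R, C, cells=400, cycles=80):
    """Midpoint estimate of sum over n of the integral of h^l(t + nT, tau), 0 < tau < T.

    Valid for 0 <= t < T: terms with n < 0 vanish because t + nT precedes tau,
    and h is zero for tau > alpha*T, so only 0 < tau < alpha*T is sampled.
    """
    d_tau = alpha * T / cells
    total = 0.0
    for n in range(cycles):
        for j in range(cells):
            tau = (j + 0.5) * d_tau
            total += h_fig1a(t + n * T, tau, alpha, T, R, C) ** l * d_tau
    return total
```

Multiplying the returned value by $\nu q^l$ gives $\kappa_l(t)$. With `R = 2` and `C = 0.5` (so $\gamma = 1$), the sum is close to `2**l / l` both at `t = 0.3` (switch closed) and at `t = 0.8` (switch open).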

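In particular, the variance of the distribution of the ensemble of
$V(t)$'s at time $t$ is

$$\langle V^2(t)\rangle - \langle V(t)\rangle^2 = \kappa_2(t) = \nu q^2C^{-2}/(2\gamma) = \frac{qV_{dc}}{2C}. \tag{38}$$

The fact that this does not depend on $t$ shows that the mean square
value of the fluctuation about $V_{dc}$ is also $qV_{dc}/(2C)$:

$$\overline{[V(t) - V_{dc}]^2} = \overline{V_N^2(t)} = \kappa_2 = \frac{qV_{dc}}{2C}. \tag{39}$$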





Equation (37) leads to the expression

$$\ln\varphi(z) = \sum_{l=1}^{\infty}\kappa_l(t)\,(iz)^l/l!
= \frac{\nu}{\gamma}\int_0^{zq/C}(e^{i\sigma} - 1)\,d\sigma/\sigma \tag{40}$$

for the characteristic function $\varphi(z)$ of the distribution of the
ensemble of $V(t)$'s at time $t$. The probability density of the
distribution is given by

$$p(V) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\varphi(z)\,e^{-izV}\,dz \tag{41}$$

where $\ln\varphi(z)$ can be expressed in terms of sine and cosine
integrals. The integral (41) also gives the probability density of the
value of a particular member of the ensemble at a time selected at
random.

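The power spectrum $W_{V_N}(f)$ of $V_N(t) = V(t) - V_{dc}$ is obtained by
substituting the value (31) of $s(f, \tau)$ in (8).

$$\begin{aligned}
W_{V_N}(f) &= \frac{\nu q^2}{T}\int_0^{\alpha T}|s(f, \tau)|^2\,d\tau, \\
&= \frac{\nu q^2R^2}{T[1 + (\omega/\gamma)^2]}
\left\{\alpha T + \mathrm{Real}\left[\frac{\gamma(1 - bz^{\alpha})(1 - z^{1-\alpha})(\gamma - i\omega)}
{(1 - bz)\,\omega^2(\gamma + i\omega)}\right]\right\},
\end{aligned} \tag{42}$$

where $\omega = 2\pi f$, $\gamma = 1/RC$, and $z$ and $b$ are given by (30):

$$z = e^{-i\omega T}, \qquad b = e^{-\gamma\alpha T}.$$

An outline of the evaluation of the integral is given in Appendix A.
The curves plotted in Fig. 3 were computed from equation (42). It
can be shown that

$$\begin{aligned}
W_{V_N}(0) &= \nu q^2R^2\left[2 - \alpha + \tfrac{1}{2}\gamma T(1 - \alpha)^2(1 + b)(1 - b)^{-1}\right]; \\
W_{V_N}(f) &\to \nu q^2R^2\gamma^2\alpha/\omega^2, \quad\text{as } f \to \infty; \\
W_{V_N}(f) &\to \frac{\nu q^2R^2/\alpha}{1 + [\omega/(\gamma\alpha)]^2}, \quad\text{as } T \to 0.
\end{aligned} \tag{43}$$

In Fig. 3, the quantity $\alpha W_{V_N}(f)/(\nu q^2R^2)$ is plotted as a
function of $\omega/(\gamma\alpha) = \omega RC/\alpha = 2\pi fRC/\alpha$. The
parameters are $\alpha$ and $\gamma T = T/(RC)$. These coordinates were
chosen because the exact computations made from equation (42) give nearly
the same values as does the last approximation ($T \to 0$) in (43) for
values of $\gamma T$ less than, say, 5.0. From

$$\int_{-\infty}^{\infty} W_{V_N}(f)\,df = \overline{V_N^2(t)} = \frac{\nu q^2R}{2C}$$

it can be shown that the area under any curve in Fig. 3 is $\pi/2$. As
$\gamma T \to \infty$, the ordinate at $f = 0$ ultimately increases as

$$\alpha(2 - \alpha) + \tfrac{1}{2}\gamma T\,\alpha(1 - \alpha)^2$$

which, for $\gamma T$ fixed, has a maximum at

$$\alpha = \frac{1}{3} + \frac{4}{3\gamma T}.$$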
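The location of this maximum is easy to confirm by a direct search. The sketch below evaluates the large-$\gamma T$ expression $\alpha(2-\alpha) + \tfrac{1}{2}\gamma T\,\alpha(1-\alpha)^2$ on a grid of $\alpha$ values and returns the maximizing point; the function names and the grid search are illustrative choices made here, and the comparison value $1/3 + 4/(3\gamma T)$ follows from setting the derivative $(1-\alpha)[2 + \tfrac{1}{2}\gamma T(1 - 3\alpha)]$ to zero.

```python
def zero_frequency_ordinate(alpha, gamma_T):
    """Large-gamma*T form of the f = 0 ordinate plotted in Fig. 3."""
    return alpha * (2.0 - alpha) + 0.5 * gamma_T * alpha * (1.0 - alpha) ** 2


def best_alpha(gamma_T, steps=10000):
    """Grid search over 0 < alpha < 1 for the alpha that maximizes the ordinate."""
    grid = [k / steps for k in range(1, steps)]
    return max(grid, key=lambda a: zero_frequency_ordinate(a, gamma_T))
```

For `gamma_T = 20` the search lands at `alpha` near 0.4, in agreement with $1/3 + 4/60$. The stationary point lies inside $0 < \alpha < 1$ only when $\gamma T > 2$.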

The oscillations in the curves for the large values of $\gamma T$ can be
correlated with the oscillations in a $(\sin f/f)^2$ type of spectrum
associated with the flat portions of length $(1 - \alpha)T$ in $V(t)$.
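Curves of the kind shown in Fig. 3 can be regenerated from a closed form for the noise spectrum. The sketch below assumes $W_{V_N}(f) = \nu q^2R^2\{\alpha T + \mathrm{Real}[\gamma(1-bz^{\alpha})(1-z^{1-\alpha})(\gamma-i\omega)/((1-bz)\omega^2(\gamma+i\omega))]\}/(T[1+(\omega/\gamma)^2])$ with $z = e^{-i\omega T}$ and $b = e^{-\gamma\alpha T}$; the function name and argument order are illustrative, and the expression is evaluated only for $f \neq 0$.

```python
import cmath
import math


def noise_spectrum_fig1a(f, nu, q, R, C, alpha, T):
    """Two-sided noise power spectrum of V(t) for the Fig. 1a circuit (f != 0)."""
    gamma = 1.0 / (R * C)
    omega = 2.0 * math.pi * f
    z = cmath.exp(-1j * omega * T)
    z_alpha = cmath.exp(-1j * omega * alpha * T)
    z_rest = cmath.exp(-1j * omega * (1.0 - alpha) * T)    # z ** (1 - alpha)
    b = math.exp(-gamma * alpha * T)
    ratio = (gamma * (1.0 - b * z_alpha) * (1.0 - z_rest) * (gamma - 1j * omega)
             / ((1.0 - b * z) * omega ** 2 * (gamma + 1j * omega)))
    bracket = alpha * T + ratio.real
    return nu * q ** 2 * R ** 2 * bracket / (T * (1.0 + (omega / gamma) ** 2))
```

Evaluated at a very low frequency, a very high frequency, and a very short period, this function reproduces the three limiting forms $\nu q^2R^2[2-\alpha+\tfrac12\gamma T(1-\alpha)^2(1+b)/(1-b)]$, $\nu q^2R^2\gamma^2\alpha/\omega^2$, and $(\nu q^2R^2/\alpha)/(1+[\omega/(\gamma\alpha)]^2)$ respectively, which gives some confidence in the algebra.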

When the shot noise current generator in Fig. 1a is replaced by a
zero-mean white noise current generator with a flat, two-sided power
spectrum $W_I(f) = N_I$, the dc component of $V(t)$ becomes 0 and
$V(t)$ is distributed normally about zero with variance

$$\langle V^2(t)\rangle = \overline{V^2(t)} = N_I/(2\gamma C^2). \tag{44}$$

This $V(t)$ is an example of a stochastic process in which the
distribution of the ensemble at time $t$ is normal and does not change
with $t$, but the process is still not a stationary gaussian process
because $dV(t)/dt$ is zero during the intervals that the switch is open.

The power spectrum $W_V(f)$ is given by equations (42) and (43)
with the multiplier $\nu q^2$ replaced by $N_I$. In the particular case in
which the period $T$ is small compared to the time constant $RC$, the
last approximation given in equation (43) goes into

$$W_V(f) \to \frac{N_IR^2/\alpha}{1 + (\omega RC/\alpha)^2}. \tag{45}$$

The Princeton Applied Research notes? obtained by Sell give results
associated with this approximation.

By Thevenin’s theorem, the portion of Fig. 1a consisting of the
infinite impedance shot noise current generator plus the resistance $R$
shunting the generator can be replaced by a zero impedance shot
noise voltage generator in series with $R$. The currents and voltages
in the remaining portion of the circuit are unchanged by this
replacement. The voltage of the new generator is $V_0(t) = I(t)R$; and
its two-sided power spectrum $W_{V_0}(f)$ is flat and equal to
$N_{V_0} = N_IR^2$. The statistical results for the voltage $V(t)$ across
$C$ can be expressed in terms of $N_{V_0}$ by replacing $N_I$ by
$N_{V_0}/R^2$ (that is, $\nu q^2$ by $N_{V_0}/R^2$). For example, equation
(44) becomes

$$\overline{V^2(t)} = N_{V_0}/(R^2\,2\gamma C^2) = N_{V_0}/(2RC). \tag{46}$$


V. RC CIRCUIT OF FIGURE 1b 


The input shot noise current $I(t)$ in Fig. 1b is the same as in Fig.
1a, and is given by the sum (22) of impulses of weight $q$. The switch
is in position $a$ during the first part of the cycle,
$nT < t < nT + \alpha T$; and in position $b$ during the second part,
$nT + \alpha T < t < (n + 1)T$.

The condenser voltage $V(t)$ increases more often than not during
the first part of the cycle. It always decreases during the second part.
Unlike the circuit of Fig. 1a, $V(t)$ has a periodic portion
$V_{\mathrm{per}}(t)$ which includes variable terms in addition to $V_{dc}$.

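Just as in Fig. 1a, we have $a_k = q$ and $E(a) = q$, $E(a^2) = q^2$. The
response $h(t, \tau)$ at time $t$ to a unit impulse of current arriving at
$\tau$, where $0 < \tau < T$, is

$$h(t, \tau) = \begin{cases}
0 & \text{for } -\infty < t < \tau, \\
C^{-1}\exp[-\gamma(t - \tau)] & \text{for } 0 < \tau < \alpha T \text{ and } \tau < t, \\
0 & \text{for } \alpha T < \tau < T \text{ and all } t.
\end{cases} \tag{47}$$

As before, $\gamma = 1/(RC)$ and

$$V(t) = \sum_{k=-\infty}^{\infty} q\,h(t, t_k). \tag{48}$$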

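The Fourier transform of $h(t, \tau)$ is

$$s(f, \tau) = \int_{\tau}^{\infty} C^{-1}e^{-\gamma(t - \tau) - i\omega t}\,dt
= \frac{C^{-1}e^{-i\omega\tau}}{\gamma + i\omega}, \qquad \omega = 2\pi f; \tag{49}$$

for $0 < \tau < \alpha T$, and $s(f, \tau) = 0$ for $\alpha T < \tau < T$. The
integral $s_0(f)$ used in computing $V_{\mathrm{per}}(t)$ in
$V(t) = V_N(t) + V_{\mathrm{per}}(t)$ is

$$\begin{aligned}
s_0(f) &= \frac{1}{T}\int_0^{\alpha T} s(f, \tau)\,d\tau, \\
&= (1 - e^{-i\omega\alpha T})/[i\omega TC(\gamma + i\omega)], \\
s_0(0) &= \alpha/(C\gamma) = \alpha R.
\end{aligned} \tag{50}$$

The dc portion of $V_{\mathrm{per}}(t)$ is

$$V_{dc} = \nu E(a)\,s_0(0) = \nu q\alpha R \tag{51}$$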
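and, from the general expression (10) for $V_{\mathrm{per}}(t)$,

$$V_{\mathrm{per}}(t) = V_{dc} + 2\,\mathrm{Real}\sum_{m=1}^{\infty}
\frac{V_{dc}\,(1 - e^{-i2\pi m\alpha})}{\alpha\,i2\pi m\,[1 + i2\pi m/(\gamma T)]}\,e^{i2\pi mt/T}. \tag{52}$$

By working with

$$V_{\mathrm{per}}(t) = \langle V(t)\rangle = \nu q\sum_{n=-\infty}^{\infty}\int_0^{\alpha T} h(t + nT, \tau)\,d\tau \tag{53}$$

it can be shown that $V_{\mathrm{per}}(t)$ increases from $A\exp(-\gamma T)$
at $t = 0$ to $A\exp(-\gamma\alpha T)$ at $t = \alpha T$, and then decreases
to $A\exp(-\gamma T)$ at $t = T$ and so on, where

$$A = \frac{V_{dc}}{\alpha}\,\frac{e^{\gamma\alpha T} - 1}{1 - e^{-\gamma T}}. \tag{54}$$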
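The rise and fall of $V_{\mathrm{per}}(t)$ over one cycle can be written out explicitly. The sketch below is a reconstruction under stated assumptions rather than a formula taken from the paper: while the switch is in position $a$ ($0 \le t \le \alpha T$) the mean voltage is taken to relax toward $V_{dc}/\alpha = \nu qR$ from the value $A e^{-\gamma T}$, and while it is in position $b$ the voltage is taken to decay as $A e^{-\gamma t}$. The function name and argument order are illustrative.

```python
import math


def v_per_fig1b(t, nu, q, R, C, alpha, T):
    """Periodic (ensemble-mean) component of V(t) for the Fig. 1b circuit."""
    gamma = 1.0 / (R * C)
    v_dc = nu * q * alpha * R                                   # equation (51)
    amp = (v_dc / alpha) * (math.exp(gamma * alpha * T) - 1.0) / (1.0 - math.exp(-gamma * T))  # (54)
    t = t % T                                                   # reduce to one switch cycle
    if t <= alpha * T:
        # Position a: decay of the value at the cycle start plus charging by the mean current.
        return amp * math.exp(-gamma * (T + t)) + (v_dc / alpha) * (1.0 - math.exp(-gamma * t))
    # Position b: free exponential decay.
    return amp * math.exp(-gamma * t)
```

The two branches agree at $t = \alpha T$, the end points match the values $A e^{-\gamma T}$ and $A e^{-\gamma\alpha T}$ stated in the text, and the average over one period equals $V_{dc}$.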


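The power spectrum $W_{V_N}(f)$ of the noise portion $V_N(t)$ of $V(t)$ is
given by equation (8) and the expression (49) for $s(f, \tau)$.

$$\begin{aligned}
W_{V_N}(f) &= \frac{\nu q^2}{T}\int_0^{\alpha T}|s(f, \tau)|^2\,d\tau, \\
&= \frac{\nu q^2}{T}\int_0^{\alpha T}\frac{C^{-2}}{\gamma^2 + \omega^2}\,d\tau, \\
&= RqV_{dc}/[1 + (\omega RC)^2].
\end{aligned} \tag{55}$$

Integrating $W_{V_N}(f)$ from $f = -\infty$ to $f = +\infty$ shows that the
time average of $V_N^2(t)$ is

$$\overline{V_N^2(t)} = \frac{qV_{dc}}{2C} \tag{56}$$

just as in the case of Fig. 1a [see equation (39)]. However, in Fig.
1a, $V_{dc} = \nu qR$; whereas in Fig. 1b, $V_{dc} = \nu q\alpha R$.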

When the shot noise current generator in Fig. 1b is replaced by a
zero-mean white noise current generator with flat power spectrum
$W_I(f) = N_I$, the periodic component $V_{\mathrm{per}}(t)$ vanishes and
the power spectrum of $V(t)$ is obtained by replacing $\nu q^2$ in
equation (55) by $N_I$:

$$W_V(f) = W_{V_N}(f) = \frac{N_I\alpha R^2}{1 + (\omega RC)^2}, \qquad \omega = 2\pi f. \tag{57}$$

The time average of $V^2(t)$ obtained by integrating equation (57) is

$$\overline{V^2(t)} = N_I\,\frac{\alpha R}{2C}. \tag{58}$$

Although the periodic component $V_{\mathrm{per}}(t) = \langle V(t)\rangle$
is zero, the ensemble variance $\langle V^2(t)\rangle$ at time $t$ is a
periodic function of $t$. It may be calculated from the second of
equations (21) in which $h^2(t, \tau)$ is obtained by squaring the
expressions (47) for $h(t, \tau)$.


VI. THE ENSEMBLE AVERAGE (y(t)) 


In this section and the two following ones, the arguments used to 
deal with shot noise will be used to determine the power spectrum and 
the moments (more precisely, the cumulants) of the distribution of the 
output y(t) of the periodically varying system shown in Fig. 2. The 
input x(t) is taken to be shot noise consisting of a train of randomly 
arriving impulses. 

Let the system of Fig. 2 start operating at time $t = 0$ and run to
$t = T_1$, where $T_1 = NT$ with $N \gg 1$. Let the number of impulses
arriving in $0 < t < T_1$ be the random variable $K$, and let the input be

$$x(t) = \sum_{k=1}^{K} a_k\,\delta(t - t_k), \quad K \geq 1; \qquad x(t) = 0, \quad K = 0; \tag{59}$$

where, as in equation (4), the impulse amplitudes $a_k$ are independent
random variables with probability density $g(a)$ and expected value

$$E(a_k) = E(a), \qquad E(a_k^2) = E(a^2). \tag{60}$$

The arrival times $t_1, t_2, \cdots, t_K$ are independent random variables
with

$$\mathrm{Prob}\,[t < t_k < t + dt] = dt/T_1. \tag{61}$$

The number of arrivals $K$ has the Poisson distribution

$$\begin{aligned}
\mathrm{Prob}\,[K = L] &= (\nu T_1)^L e^{-\nu T_1}/L!, \\
E(K) &= \nu T_1, \\
E(K^2 - K) &= (\nu T_1)^2,
\end{aligned} \tag{62}$$

where $\nu$ is the expected number of arrivals per second.
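The two moments of $K$ quoted in (62) are the ones used later when the Type I and Type II terms of the power spectrum are averaged, so it is worth confirming them directly from the Poisson probabilities. The sketch below sums the distribution term by term; the function names and the truncation at a fixed number of terms are illustrative choices, and `mean` stands for the product $\nu T_1$.

```python
import math


def poisson_pmf(L, mean):
    """Prob[K = L] for a Poisson variable with the given mean (= nu * T1)."""
    return math.exp(-mean + L * math.log(mean) - math.lgamma(L + 1))


def poisson_moments(mean, terms=200):
    """Return (total probability, E(K), E(K^2 - K)) by direct summation."""
    total = sum(poisson_pmf(L, mean) for L in range(terms))
    first = sum(L * poisson_pmf(L, mean) for L in range(terms))
    second_factorial = sum(L * (L - 1) * poisson_pmf(L, mean) for L in range(terms))
    return total, first, second_factorial
```

For `mean = 10` the three sums come out as 1, 10, and 100 to within rounding, matching $E(K) = \nu T_1$ and $E(K^2 - K) = (\nu T_1)^2$.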
The output produced by the input (59) is

$$y(t) = \sum_{k=1}^{K} a_k\,h(t, t_k), \quad K \geq 1; \qquad y(t) = 0, \quad K = 0. \tag{63}$$

When $t$ is fixed, $y(t)$ may be regarded as a random variable since
it depends on the random variables $K$, $a_k$, $t_k$. The $l$th moment of
the distribution of $y(t)$ is the ensemble average
$\langle y^l(t)\rangle$. Usually $\langle y^l(t)\rangle$ will depend on $t$
and be periodic with period $T$. We shall be concerned with the first
moment $\langle y(t)\rangle$ in the remainder of this section.

When the right side of the first part of equation (63) is averaged
over the ensemble of $a_k$'s, it becomes

$$E(a)\sum_{k=1}^{K} h(t, t_k). \tag{64}$$

Averaging this over the ensemble of $t_k$'s gives

$$E(a)\sum_{k=1}^{K}\frac{1}{T_1}\int_0^{T_1} dt_k\,h(t, t_k)
= KE(a)\,\frac{1}{T_1}\int_0^{T_1} dt_1\,h(t, t_1) \tag{65}$$

where use has been made of the fact that all of the terms in the
series on the left are equal. Finally, averaging over the ensemble of
$K$'s with the help of $E(K) = \nu T_1$ gives

$$\langle y(t)\rangle = \nu E(a)\int_0^{T_1} dt_1\,h(t, t_1). \tag{66}$$

Dividing the interval $(0, T_1)$ into $N$ equal intervals of length $T$,
setting $t_1 = nT + \tau$, and using the periodic property
$h(t + nT, \tau + nT) = h(t, \tau)$ leads to

$$\begin{aligned}
\langle y(t)\rangle &= \nu E(a)\sum_{n=0}^{N-1}\int_{nT}^{(n+1)T} dt_1\,h(t, t_1), \\
&= \nu E(a)\sum_{n=0}^{N-1}\int_0^T d\tau\,h(t - nT + nT,\ nT + \tau), \\
&= \nu E(a)\sum_{n=0}^{N-1}\int_0^T d\tau\,h(t - nT, \tau).
\end{aligned} \tag{67}$$


Equation (67) holds when the system starts operating at $t = 0$
and stops at $t = T_1$. The following heuristic argument suggests that
when (i) the system runs from $t = -\infty$ to $+\infty$, and (ii)
$h(t, \tau)$ is such that only recent arrivals are of importance in
determining the present state of the system, the analogue of equation
(67) is

$$\langle y(t)\rangle = \nu E(a)\sum_{n=-\infty}^{\infty}\int_0^T d\tau\,h(t - nT, \tau). \tag{68}$$


We assume that, for $0 < \tau < T$, $h(u, \tau)$ becomes negligible when
$|u| \geq mT$ where $m$ is a small integer. We define $t$ to be in the
“interior” of $(0, T_1)$ when

$$mT < t < T_1 - mT.$$

If $t$ is in the interior of $(0, T_1)$, the summation in equation (67)
can be written as

$$\sum_{n=0}^{N-1} = \sum_{|t - nT| < mT} = \sum_{n=-\infty}^{\infty}$$

because $h(t - nT, \tau)$ is negligible except when $|t - nT| < mT$. Hence
when $t$ is in the interior of $(0, T_1)$ and the system runs from 0 to
$T_1$, $\langle y(t)\rangle$ is given by both (67) and (68).

In the interior of $(0, T_1)$ the starting and stopping transients near
0 and $T_1$ have died out, and $y(t)$ is the same irrespective of whether
the system runs from 0 to $T_1$ or from $-\infty$ to $+\infty$. Hence when
$t$ is in the interior of $(0, T_1)$ and the system runs from $-\infty$ to
$+\infty$, $\langle y(t)\rangle$ is again given by both (67) and (68).

The right side of equation (68) is a periodic function of $t$ of period
$T$. Physical considerations suggest that when the system runs from
$-\infty$ to $+\infty$, $\langle y(t)\rangle$ is also a periodic function
of period $T$. Since $\langle y(t)\rangle$ and the right side of equation
(68) are equal when $t$ lies in the interior of $(0, T_1)$ (which extends
over more than one period), it is plausible to say that the equality
holds for all values of $t$. This is what we wished to show.

Equation (68) appears as equation (13) in Section III. The sign 
of the index of summation n has been changed to make it easier to 
apply the formula. 


VII. THE CUMULANTS FOR y(t)


The $l$th moment $\langle y^l(t)\rangle$ may be expressed in terms of the
first $l$ cumulants $\kappa_1(t), \cdots, \kappa_l(t)$ of the distribution
and conversely. For $l = 1$ and $l = 2$,

$$\kappa_1(t) = \langle y(t)\rangle, \qquad
\kappa_2(t) = \langle y^2(t)\rangle - \langle y(t)\rangle^2. \tag{69}$$

The cumulants are defined by

$$\ln\varphi(z) = \sum_{l=1}^{\infty}\kappa_l(t)\,(iz)^l/l! \tag{70}$$

where $\varphi(z)$ is the characteristic function

$$\varphi(z) = \langle\exp[izy(t)]\rangle. \tag{71}$$


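The method of averaging over the ensemble used in the preceding
section to obtain $\langle y(t)\rangle$ will now be applied to calculate
$\langle\exp[izy(t)]\rangle$. We have, because of the independence of the
$a_k$'s and $t_k$'s,

$$\begin{aligned}
\langle\exp[izy(t)]\rangle &= \left\langle\exp\left[iz\sum_{k=1}^{K} a_k h(t, t_k)\right]\right\rangle, \\
&= \sum_{K=0}^{\infty}\frac{(\nu T_1)^K}{K!}\,e^{-\nu T_1}\,\langle\exp[iza_1h(t, t_1)]\rangle^K, \\
&= \exp\left[-\nu T_1 + \nu T_1\langle\exp[iza_1h(t, t_1)]\rangle\right].
\end{aligned} \tag{72}$$

Therefore, upon using the definition (71) of $\varphi(z)$ and the
probability densities of $a_1$ and $t_1$,

$$\ln\varphi(z) = -\nu T_1 + \nu T_1\int_{-\infty}^{\infty} da_1\,g(a_1)
\int_0^{T_1}\frac{dt_1}{T_1}\exp[iza_1h(t, t_1)]. \tag{73}$$

Expanding both sides in powers of $z$ and equating coefficients of
$(iz)^l/l!$,

$$\begin{aligned}
\kappa_l(t) &= \nu\int_{-\infty}^{\infty} da_1\,g(a_1)\,a_1^l\int_0^{T_1} dt_1\,h^l(t, t_1), \\
&= \nu E(a^l)\int_0^{T_1} dt_1\,h^l(t, t_1).
\end{aligned} \tag{74}$$

When $l = 1$, equation (74) reduces to equation (66) for
$\langle y(t)\rangle$. The steps that lead from equation (66) to the final
expression for $\langle y(t)\rangle$ carry (74) into

$$\kappa_l(t) = \nu E(a^l)\sum_{n=-\infty}^{\infty}\int_0^T d\tau\,h^l(t - nT, \tau). \tag{75}$$

This appears as equation (15) in Section III with $n$ replaced by $-n$.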


VIII. THE POWER SPECTRUM OF y(t) 


When $h(t, \tau)$ is such that $y(t)$ has the two-sided power spectrum
$W_y(f)$, it is given by*

$$W_y(f) = \lim_{T_1\to\infty}\langle|S(f, T_1)|^2\rangle/T_1 \tag{76}$$

where

$$\begin{aligned}
S(f, T_1) &\equiv S(f, T_1; K; a_1, \cdots, a_K; t_1, \cdots, t_K) \\
&= \int_0^{T_1} dt\,e^{-i\omega t}y(t), \qquad \omega = 2\pi f; \\
&= \sum_{k=1}^{K} a_k\int_0^{T_1} dt\,e^{-i\omega t}h(t, t_k), \\
&= \sum_{k=1}^{K} a_k\int_{-t_k}^{T_1 - t_k} du\,e^{-i\omega(t_k + u)}h(t_k + u, t_k).
\end{aligned} \tag{77}$$

In the derivation of equation (68) for $\langle y(t)\rangle$, the limits
of summation $n = 0$, $n = N - 1$ were replaced by $n = -\infty$,
$n = \infty$. In much the same way, we assume that in (77) the limits of
integration $-t_k$, $T_1 - t_k$ can be replaced by $-\infty$, $+\infty$ in
all but a negligible fraction of the terms (those with $t_k$ near 0 or
$T_1$). This presupposes a sufficiently rapid decrease in the value of
$|h(t_k + u, t_k)|$ as $|u| \to \infty$. Heuristically, we picture
$h(t_k + u, t_k)$ as being negligible except when $u$ is small. When $T_1$
is very large, most of the $t_k$'s and $(T_1 - t_k)$'s will be large.
Consequently, for most of the $t_k$'s, $h(t_k + u, t_k)$ will be
negligible when $u$ is less than $-t_k$ or greater than $T_1 - t_k$.

This assumption allows us to replace equations (76) and (77) by

$$W_y(f) = \lim_{T_1\to\infty}\langle|S_a(f, T_1)|^2\rangle/T_1 \tag{78}$$

and

$$\begin{aligned}
S_a(f, T_1) &= \sum_{k=1}^{K} a_k\int_{-\infty}^{\infty} du\,e^{-i\omega(t_k + u)}h(t_k + u, t_k), \\
&= \sum_{k=1}^{K} a_k\,s(f, t_k),
\end{aligned} \tag{79}$$

where

$$s(f, \tau) = \int_{-\infty}^{\infty} dt\,e^{-i\omega t}h(t, \tau). \tag{80}$$

From equation (79)

$$|S_a(f, T_1)|^2 = S_a(f, T_1)\,S_a^*(f, T_1)
= \sum_{k=1}^{K}\sum_{l=1}^{K} a_ka_l\,s(f, t_k)\,s^*(f, t_l), \tag{81}$$

where the star denotes conjugate complex. The terms in equation (81)
can be divided into two types. For Type I, $l = k$, and for Type II,
$l \neq k$. It is convenient to take their ensemble averages separately.
The typical Type I term is

$$a_k^2\,|s(f, t_k)|^2. \tag{82}$$
There are $K$ terms of Type I in the double sum (81), and all of them
are of the form (82). Therefore, when use is made of $E(K) = \nu T_1$,
the contribution of the Type I terms to
$\langle|S_a(f, T_1)|^2\rangle$ is found to be

$$\nu E(a^2)\int_0^{T_1} d\tau\,|s(f, \tau)|^2. \tag{83}$$

The typical Type II term in (81) is

$$a_ka_l\,s(f, t_k)\,s^*(f, t_l), \qquad l \neq k.$$

When averaged with respect to $a_k$, $a_l$, $t_k$, $t_l$ it becomes

$$E^2(a)\left|\int_0^{T_1}\frac{dt_k}{T_1}\,s(f, t_k)\right|^2. \tag{84}$$

There are $K^2 - K$ terms of Type II in the double sum (81) and all of
them have the average value (84). Therefore, when use is made of
$E(K^2 - K) = \nu^2T_1^2$, the contribution of Type II terms to
$\langle|S_a(f, T_1)|^2\rangle$ is found to be

$$\nu^2E^2(a)\left|\int_0^{T_1} d\tau\,s(f, \tau)\right|^2. \tag{85}$$

Adding the contributions of Type I and Type II, and inserting the
resulting expression for $\langle|S_a(f, T_1)|^2\rangle$ in equation (78)
for the power spectrum gives, with $\omega = 2\pi f$ and $s(f, \tau)$
given by (80),

$$W_y(f) = \lim_{T_1\to\infty}\frac{1}{T_1}\left[\nu E(a^2)\int_0^{T_1} d\tau\,|s(f, \tau)|^2
+ \left|\nu E(a)\int_0^{T_1} d\tau\,s(f, \tau)\right|^2\right] \tag{86}$$

provided $s(f, \tau)$ [i.e., $h(t, \tau)$] is such that the limit exists.
If, for certain frequencies, the function of $T_1$ following the limit
sign ultimately increases linearly with $T_1$, $W_y(f)$ has an infinite
spike at these frequencies. This means that $y(t)$ has sinusoidal
components at these frequencies.

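So far in this section, the time variation of the system has not
been assumed to be periodic. Now we apply (86) to the case in which
the system varies periodically with period $T$ and, in accordance with
equations (3) and (80),

$$\begin{aligned}
h(t - nT + nT, \tau + nT) &= h(t - nT, \tau), \\
s(f, \tau + nT) &= e^{-i\omega nT}s(f, \tau).
\end{aligned} \tag{87}$$

In (86) set $T_1 = NT$ and let $N \to \infty$. Then

$$\begin{aligned}
\frac{1}{T_1}\int_0^{T_1} d\tau\,|s(f, \tau)|^2 &= \frac{1}{T}\int_0^T d\tau\,|s(f, \tau)|^2, \\
\int_0^{T_1} d\tau\,s(f, \tau) &= \sum_{n=0}^{N-1} e^{-i\omega nT}\int_0^T d\tau\,s(f, \tau), \\
&= Ts_0(f)\,\frac{1 - e^{-i\omega NT}}{1 - e^{-i\omega T}},
\end{aligned} \tag{88}$$

where

$$s_0(f) = \frac{1}{T}\int_0^T d\tau\,s(f, \tau), \qquad \omega = 2\pi f. \tag{89}$$

The contribution of the second term in (86) contains the factor

$$\frac{1}{NT}\left[\frac{\sin(\omega NT/2)}{\sin(\omega T/2)}\right]^2
\to \frac{1}{T^2}\sum_{m=-\infty}^{\infty}\delta\!\left(f - \frac{m}{T}\right) \tag{90}$$

where the last step follows from the relations used in the proof of
Fejér’s theorem in the theory of Fourier series. When these results
are used in equation (86), it goes into

$$W_y(f) = \frac{\nu E(a^2)}{T}\int_0^T d\tau\,|s(f, \tau)|^2
+ \nu^2E^2(a)\sum_{m=-\infty}^{\infty}\delta\!\left(f - \frac{m}{T}\right)\left|s_0\!\left(\frac{m}{T}\right)\right|^2. \tag{91}$$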
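The limiting step that turns the finite-$N$ factor into a train of delta functions can be illustrated numerically. The sketch below assumes the factor has the Fejér-kernel form $(NT)^{-1}[\sin(\pi fNT)/\sin(\pi fT)]^2$ and checks that its area over one period of width $1/T$ in $f$ stays equal to $1/T^2$ while its peak height $N/T$ grows with $N$, which is the behavior of a delta function of weight $1/T^2$ at each $f = m/T$. The function names and the midpoint rule are illustrative choices.

```python
import math


def fejer_factor(f, N, T):
    """(1/(N*T)) * [sin(pi*f*N*T) / sin(pi*f*T)]**2, with its limit N/T at f = m/T."""
    denom = math.sin(math.pi * f * T)
    if abs(denom) < 1e-12:
        return N / T
    return (math.sin(math.pi * f * N * T) / denom) ** 2 / (N * T)


def area_one_period(N, T, cells=4000):
    """Midpoint-rule area of the factor over -1/(2T) < f < 1/(2T)."""
    df = 1.0 / (T * cells)
    return sum(fejer_factor(-0.5 / T + (j + 0.5) * df, N, T) for j in range(cells)) * df
```

With `T = 2`, the area is 0.25 for both `N = 10` and `N = 50`, while the peak value rises from 5 to 25.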


Equation (91) shows spikes in $W_y(f)$ at $f = m/T$ where $m$ is an
integer. The spike at $f = 0$ corresponds to the dc component $y_{dc}$ of
$y(t)$, and the spikes at $\pm m/T$ to the sinusoidal component

$$A_m\cos[2\pi m(t/T) + \theta_m] \tag{92}$$

in $y(t)$. The expression (91) for $W_y(f)$ shows that the (time) average
powers in these components are

$$\begin{aligned}
y_{dc}^2 &= [\nu E(a)\,s_0(0)]^2, \\
\tfrac{1}{2}A_m^2 &= [\nu E(a)]^2\left[|s_0(-m/T)|^2 + |s_0(m/T)|^2\right], \\
&= 2[\nu E(a)\,|s_0(m/T)|]^2.
\end{aligned} \tag{93}$$

Equations (93) tell us nothing about the sign of $y_{dc}$ or about the
phase angle $\theta_m$. One of the several ways to get this information is
to imagine $y(t)$ expanded in a Fourier series of long period $T_1$,

$$\begin{aligned}
y(t) &= \sum_{n=-\infty}^{\infty} c_n\exp(i2\pi nt/T_1), \\
c_n &= \frac{1}{T_1}\int_0^{T_1} dt\,y(t)\exp(-i2\pi nt/T_1), \\
&= \frac{1}{T_1}S\!\left(\frac{n}{T_1}, T_1\right)
\approx \frac{1}{T_1}S_a\!\left(\frac{n}{T_1}, T_1\right), \\
&= \frac{1}{T_1}\sum_{k=1}^{K} a_k\,s\!\left(\frac{n}{T_1}, t_k\right),
\end{aligned} \tag{94}$$

where we have used equations (77) and (79) for $S(f, T_1)$ and its
approximation $S_a(f, T_1)$. The expression (91) for $W_y(f)$ shows that
the $c_n$'s may be divided into two classes; those corresponding to the
frequencies $n/T_1 = m/T$, i.e., $n = mN$ (discrete sinusoidal components)
and those corresponding to $n \neq mN$ (noise). For the first class, $c_n$
is $O(1)$ and nearly the same for most $y(t)$'s of the ensemble. For the
second class, $c_n$ is $O(T_1^{-1/2})$ and varies greatly from member to
member.

To obtain the discrete sinusoidal component in $y(t)$ of frequency
$m/T$, we set $n = mN$ in equation (94) and apply the procedure used
in Section VI (to obtain $\langle y(t)\rangle$) to average $c_n$ over the
ensemble.

$$\begin{aligned}
\langle c_{mN}\rangle &= \frac{1}{T_1}E(K)E(a)\int_0^{T_1}\frac{dt_1}{T_1}\,s\!\left(\frac{m}{T}, t_1\right), \\
&= \frac{\nu E(a)}{T_1}\,Ts_0\!\left(\frac{m}{T}\right)\sum_{n=0}^{N-1}\exp[-i2\pi mn], \\
&= \nu E(a)\,s_0(m/T),
\end{aligned} \tag{95}$$

where we have used equations (88) and (89) with $\omega = 2\pi m/T$. We
therefore write $y(t)$ as the sum of a noise component $y_N(t)$,
consisting of the sum of terms of the second class, and a periodic
component $y_{\mathrm{per}}(t)$, consisting of the sum of terms of the
first class:

$$y(t) = y_N(t) + y_{\mathrm{per}}(t). \tag{96}$$

The power spectrum of $y_N(t)$ is the first term in the expression (91)
for $W_y(f)$:

$$W_{y_N}(f) = \nu E(a^2)\,\frac{1}{T}\int_0^T d\tau\,|s(f, \tau)|^2. \tag{97}$$

The periodic component is, from (95),

$$y_{\mathrm{per}}(t) = \nu E(a)\sum_{m=-\infty}^{\infty} s_0(m/T)\exp[i2\pi mt/T]. \tag{98}$$


The parts $y_N(t)$ and $y_{\mathrm{per}}(t)$ of $y(t)$ are related to the
ensemble averages by

$$y_{\mathrm{per}}(t) = \langle y(t)\rangle = \kappa_1(t), \tag{99}$$

$$\langle y_N^2(t)\rangle = \langle y^2(t)\rangle - \langle y(t)\rangle^2 = \kappa_2(t). \tag{100}$$

Equation (99) can be proved by showing that the $m$th Fourier
coefficients of $y_{\mathrm{per}}(t)$ and $\langle y(t)\rangle$ are equal
for all integers $m$, i.e., by showing that

$$\nu E(a)\,s_0(m/T) = \frac{1}{T}\int_0^T\langle y(t)\rangle\exp(-i2\pi mt/T)\,dt. \tag{101}$$

When the series (68) for $\langle y(t)\rangle$ is substituted on the
right, the summation and the integration with respect to $t$ from 0 to
$T$ combine to give an integral in $t$ with limits $\pm\infty$. This
integral can be evaluated with the help of the integral (80) for
$s(f, \tau)$ and leads to the verification of (99). Equation (100)
follows from the ensemble average of the square of

$$y_N(t) = y(t) - y_{\mathrm{per}}(t) = y(t) - \langle y(t)\rangle.$$

Setting $l = 2$ in the expression (75) for $\kappa_l(t)$ and using (100)
gives an expression for the ensemble average of $y_N^2(t)$ at time $t$,

$$\langle y_N^2(t)\rangle = \kappa_2(t) = \nu E(a^2)\sum_{n=-\infty}^{\infty}\int_0^T h^2(t - nT, \tau)\,d\tau. \tag{102}$$

It follows from (102) that when the variance $\langle y_N^2(t)\rangle$
varies with $t$, it varies periodically with period $T$. When equation
(102) is averaged over a period and use is made of the ergodic relation
(2), we get the time average

$$\overline{y_N^2(t)} = \overline{\langle y_N^2(t)\rangle}
= \frac{\nu E(a^2)}{T}\int_0^T d\tau\int_{-\infty}^{\infty} dt\,h^2(t, \tau). \tag{103}$$

From the expression (97) for $W_{y_N}(f)$, we get a second expression for
$\overline{y_N^2(t)}$:

$$\overline{y_N^2(t)} = \int_{-\infty}^{\infty} W_{y_N}(f)\,df
= \frac{\nu E(a^2)}{T}\int_0^T d\tau\int_{-\infty}^{\infty} df\,|s(f, \tau)|^2. \tag{104}$$

The equality of (103) and (104) can also be proved directly by using
the Fourier integral (80) relating $s(f, \tau)$ and $h(t, \tau)$.


IX. WHITE GAUSSIAN NOISE INPUT 


Let the input $x(t)$ of the periodically varying system shown in
Fig. 2 be white gaussian noise with zero mean. Here we show that
the output $y(t)$ has no dc or sinusoidal components, and that the
power spectrum of $y(t)$ is

$$W_y(f) = \frac{N_0}{T}\int_0^T |s(f, \tau)|^2\,d\tau \tag{105}$$

where the power spectrum of $x(t)$ is $W_x(f) = N_0$ for $|f| < F$ and
$W_x(f) = 0$ for $|f| > F$ with $F \to \infty$.

Consider Fig. 4 in which an ideal low pass filter which passes only 
the frequencies |f| < F has been inserted between the input and the 
periodically varying network specified by h(t, 7). 

When $x(t)$ is a unit impulse applied at time $\tau$,
$x(t) = \delta(t - \tau)$, the filter output at time $t = t_1$ is

$$z(t_1) = \frac{\sin 2\pi F(t_1 - \tau)}{\pi(t_1 - \tau)}, \tag{106}$$

and the system output at time $t$ is

$$y(t) = \int_{-\infty}^{\infty} h(t, t_1)\,\frac{\sin 2\pi F(t_1 - \tau)}{\pi(t_1 - \tau)}\,dt_1. \tag{107}$$

Thus, when $h(t, t_1)$ satisfies conditions associated with the Fourier
integral theorem, $y(t)$ tends to $h(t, \tau)$ as $F \to \infty$; a result
which follows immediately from physical considerations.

Take $x(t)$ to be the shot noise given by (4) in which, for given values
of $N_0$ and $\nu$, $a_k = \pm(N_0/\nu)^{1/2}$ with equal probability. Then

$$\begin{aligned}
\nu E(a) &= \nu E(a_k) = 0, \\
\nu E(a^2) &= \nu E(a_k^2) = N_0,
\end{aligned} \tag{108}$$

and the filter output is the zero-mean shot noise

$$z(t) = \sum_{k=-\infty}^{\infty} a_k\,\frac{\sin 2\pi F(t - t_k)}{\pi(t - t_k)} \tag{109}$$

with the power spectrum

$$W_z(f) = \nu E(a^2)\left|\int_{-\infty}^{\infty} e^{-i\omega t}\,\frac{\sin 2\pi Ft}{\pi t}\,dt\right|^2
= \begin{cases} N_0, & |f| < F; \\ 0, & |f| > F. \end{cases} \tag{110}$$

Fig. 4. Conversion of shot noise x(t) to white noise z(t).


Now hold $F$ fixed and let $\nu \to \infty$. The individual pulses
comprising $z(t)$ become smaller and smaller, and overlap more and more.
In the limit $z(t)$ becomes zero-mean gaussian noise with the power
spectrum (110).

Finally, let $F \to \infty$. Then $z(t)$ becomes white gaussian noise with
the flat power spectrum $W_z(f) = N_0$. According to equation (107), the
response of the Fig. 4 system at time $t$ to a unit impulse applied at
time $\tau$ tends to $h(t, \tau)$ as $F \to \infty$. Therefore, the results
obtained in Sections VI, VII, and VIII for shot noise input in Fig. 2 are
carried into corresponding results for white noise input (i.e., $x(t)$ in
Fig. 2 is white gaussian noise) by the substitutions (108), namely
$\nu E(a) = 0$ and $\nu E(a^2) = N_0$.

Setting $\nu E(a) = 0$ in equation (98) for $y_{\mathrm{per}}(t)$ shows
that $y_{\mathrm{per}}(t)$ is zero for zero-mean white noise input.
Consequently, $y(t)$ contains no dc or sinusoidal components.

Setting vi (a) = 0, and vE (a?) = No in equation (91) for W,(f) 
shows that the power spectrum of y(t) is given by equation (105) 
when the input is white gaussian noise. Furthermore, y(t) is composed 
entirely of yy(¢); and equations (102), (103), and (104) become 


$$\langle y^2(t)\rangle = N_0 \sum_{n=-\infty}^{\infty} \int_0^T h^2(t - nT, \tau)\,d\tau, \tag{111}$$

T © 
PO=p arf awe, 


= far [ap lod, oF. 


The fraction of time any particular member of the ensemble of outputs 
spends in the infinitesimal interval Y < y(t) < Y + dY is 


(112) 


$$\frac{dY}{T}\int_0^T dt\,[2\pi\langle y^2(t)\rangle]^{-1/2} \exp\left[-Y^2/(2\langle y^2(t)\rangle)\right] \tag{113}$$


where $\langle y^2(t)\rangle$ is the function of $t$ defined by equation (111).
APPENDIX A


The Power Spectrum for Figure 1a
Here we give some of the steps leading from the first line to the second 
line of equation (42) for Wy,(f). The first line is 
- 2 aT 
Wee) =] is. DP ar (114) 
where $s(f, \tau)$ is given by equation (31). Multiplying (31) by its complex
conjugate gives
= 2 Che 
A aah te ee, OM eS 
Is(f, 7)| a “” -. eg ee \(y + tw)e|” 
“2, yTtior — 9% 
Cube" *"(@ — 2°)(—9) ( aie 


2 
z2—2° 


1 — bz 








Tee E Te) (1 — bay + ta) (ie) 











Then 
aT Co Gees 2 1)y ee gH 2 
2 = Ne | | 
[ Is(f, 7) |? dr = a [ar + Oe Tate 


+ 2 Real ye De 29) : (116) 


(1 — bz)(y + tw) (tw) 
Upon introducing the values b = exp (—yaT'), 2 = exp (—iw7’), and 
using the identity 





2 - 
aq — ey (ZZ) TS Gt = OE = 2") 
4(1 — 0°) |} ] = —Real aa (117) 








the quantity within the square brackets in equation (116) becomes 


(l — be) — 2) vv — to) 
(1 — bake + ta) aS 


and thus leads to the expression of Wy,(f) given by the second line of 
equation (42). 


aT + Real 
A New Approach for Evaluating the Error
Probability in the Presence of Intersymbol 
Interference and Additive Gaussian Noise 


By E. Y. HO and Y. S. YEH 
(Manuscript received June 25, 1970) 


The determination of the error probability of a data transmission system
in the presence of intersymbol interference and additive gaussian noise is
a major goal in the analysis of such systems. The exhaustive method for
finding the error probability calculates all the possible states of the received
signal using an N-sample approximation of the true channel impulse
response. This method is too time-consuming because the computation
involved grows exponentially with N. The worst-case sequence bound
avoids the lengthy computation problem but is generally too loose.

In this paper, we have developed a new method* which yields the error
probability in terms of the first 2k moments of the intersymbol interference.
A recurrence relation for the moments is derived. Therefore, a good
approximation to the error probability of the true channel can be obtained by
choosing N large enough, and the amount of computation involved increases
only linearly with N. The series expansion is shown to be absolutely
convergent, and an upper bound on the series truncation error is given.
In order to show the improvement provided in this new method, it is
compared with the Chernoff bound technique in three representative cases.
An order of magnitude improvement in accuracy is obtained.


I. INTRODUCTION 


An important problem in the analysis of binary digital data systems
is the determination of the system performance in the presence
of intersymbol interference and additive gaussian noise. Since it is
usually the most meaningful criterion in designing a digital data
system, the error probability is chosen as the measure of the system
performance.


* In April 1970, the authors were advised by R. W. Pulleyblank that a similar
method was discovered independently by M. Celebiler and O. Shimbo to be
presented in a paper which will be published in Conference Record, ICC, 1970.

Two alternatives are available at present. The first alternative¹,²
considers a truncated N-pulse-train approximation of the true channel.
The error probability is calculated by evaluating the conditional error
probability of each of $2^N$ possible data sequences and averaging over
all $2^N$ sequences. Since each calculation of the conditional error
probability takes a great deal of computer time, the number of sequences
must be held to several thousand.’ This limitation leads to a poor 
approximation of the true channel, and the error probability so ob- 
tained is not very useful. The second alternative evaluates an upper 
bound of the error probability by either the worst-case sequence³ or
the Chernoff inequality.⁴,⁵ In many cases, the bound is too loose.

In this study we have developed a new way to evaluate the error 
probability in terms of the first 2k moments of the intersymbol inter- 
ference. It provides a significant improvement in accuracy over the 
worst-case sequence bound or the Chernoff bound. The computations 
increase only linearly with N. Thus a good approximation of the true 
channel may be obtained. The convergence of this alternative is proved. 
Throughout, additive gaussian noise and independence of information 
digits are assumed. The generalization to a multilevel system is 
straightforward; hence, only binary systems will be considered in 
this study. 


II. BRIEF DESCRIPTION OF THE SYSTEM 


A simplified block diagram of a binary amplitude modulation (AM)
data system is shown in Fig. 1. We assume that a single s(t) having
amplitude $a_\ell$ is transmitted through the channel every T seconds. The
system transfer function is

$$R(\omega) = S(\omega)T(\omega)E(\omega) \tag{1}$$

where s(t) and r(t) are the Fourier transform pairs of $S(\omega)$ and $R(\omega)$,
respectively. In the absence of channel noise, a sequence of input
channel signals

$$\sum_{\ell=-\infty}^{\infty} a_\ell\, s(t - \ell T), \tag{2}$$

will generate a corresponding output sequence

$$\sum_{\ell=-\infty}^{\infty} a_\ell\, r(t - \ell T), \tag{3}$$

where $\{a_\ell\}$ is a sequence of independent binary random variables,
$a_\ell = \pm 1$, and satisfies

$$P_r(a_\ell = 1) = P_r(a_\ell = -1) = \tfrac{1}{2}, \qquad \ell = \cdots, -1, 0, 1, \cdots. \tag{4}$$

Fig. 1—Simplified block diagram of a binary AM data system.


We also assume that additive gaussian noise is present in the system.
Thus the corrupted received sequence at the input to the receiver
detector is

$$y(t) = \sum_{\ell=-\infty}^{\infty} a_\ell\, r(t - \ell T) + n(t), \tag{5}$$

where n(t) is additive gaussian noise with a one-sided power spectral
density of $\sigma^2$ watts/cps.

At the detector, y(t) is sampled every T seconds to determine the
transmitted signal. At sampling instant $t_0$, the sampled signal is

$$y(t_0) = a_0 r(t_0) + \sum_{\ell \ne 0} a_\ell\, r(t_0 - \ell T) + n(t_0). \tag{6}$$

The first term is the desired signal while the second and the third
terms represent the intersymbol interference and gaussian noise
respectively.

It is well known that the optimum (minimum error probability)
decision level is zero. Thus the error probability is given by

$$P_e = P_r\left[\sum_{\ell \ne 0} a_\ell\, r(t_0 - \ell T) + n(t_0) > r(t_0)\right]. \tag{7}$$


For the real system we are interested in, we may assume that the
$a_\ell r(t_0 - \ell T)$'s are uniformly bounded and $\sum_{\ell \ne 0} a_\ell r(t_0 - \ell T)$ converges
absolutely.* For example, in a system having an open binary eye,
$\sum_{\ell \ne 0} |r(t_0 - \ell T)|$ is less than $r(t_0)$. Thus by Kolmogorov's Three-Series
criterion, it can be easily shown that $\sum_{\ell \ne 0} a_\ell r(t_0 - \ell T)$ converges
absolutely to a random variable.

* Finite truncated pulse-train approximation will be used for those pulses with
absolutely divergent intersymbol interference.

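Equation (7) can be calculated by evaluating the expected value of the
conditional expectation of the error probability for a given random
variable $\sum_{\ell \ne 0} a_\ell r(t_0 - \ell T)$; therefore,

$$P_e = \int_{\text{all } X} \int_{-\infty}^{0} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\{y - r(t_0) - X\}^2/2\sigma^2\right] dy\, dF(X), \tag{8}$$

where F(X) is the distribution function of the random variable X, and
$X = \sum_{\ell \ne 0} a_\ell r(t_0 - \ell T)$.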
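
The averaging described by equation (8) can be illustrated for a truncated pulse train, where X takes only $2^N$ equally likely values. The following is a minimal sketch, not the authors' program; the function names and the interpretation of σ as the rms value of the noise sample are assumptions made for illustration.

```python
import itertools
import math


def q_func(u):
    """Gaussian tail probability Q(u) = 0.5 * erfc(u / sqrt(2))."""
    return 0.5 * math.erfc(u / math.sqrt(2.0))


def exhaustive_pe(r0, isi, sigma):
    """Average the conditional error probability over all 2^N sign patterns.

    r0    : main sample r(t0)
    isi   : list of the N interfering samples r(t0 - l*T), l != 0
    sigma : rms value of the noise sample n(t0) (assumed interpretation)
    """
    total = 0.0
    for signs in itertools.product((-1.0, 1.0), repeat=len(isi)):
        x = sum(a * r for a, r in zip(signs, isi))  # one value of X
        total += q_func((r0 - x) / sigma)           # P(error | X = x)
    return total / 2 ** len(isi)
```

The loop visits every one of the $2^N$ data sequences, which is the exponential growth the exhaustive method suffers from.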


III. SERIES EXPANSION OF Pe 


With the exception of a few special cases, equation (8) is generally
difficult to solve. The existing solutions are either too
time-consuming¹,² or inaccurate.³,⁴,⁵

We have found that equation (8) can be evaluated in terms of an
absolutely convergent series involving moments of the intersymbol
interference. Furthermore, the moments can be obtained readily
through recurrence relations. Therefore, the computation time is
significantly reduced in comparison with the exhaustive method.¹,² The
absolute convergence and the recurrence relations for the moments are
given in Appendices A and B respectively.

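Expanding equation (8), we obtain the following expression for the
error probability,

$$P_e = \frac{1}{2}\operatorname{erfc}\left(\frac{r(t_0)}{\sqrt{2}\,\sigma}\right) + \frac{1}{\sqrt{\pi}} \exp\left[-\left(\frac{r(t_0)}{\sqrt{2}\,\sigma}\right)^2\right] \sum_{k} \frac{M_{2k}}{(2k)!} \left(\frac{1}{\sqrt{2}\,\sigma}\right)^{2k} H_{2k-1}\left(\frac{r(t_0)}{\sqrt{2}\,\sigma}\right), \qquad k = 1, 2, 3, \cdots, \tag{9}$$

where $H_{2k-1}(x)$ is a Hermite polynomial, $M_{2k}$ is the 2kth moment of
the random variable X, and

$$\operatorname{erfc}(x) = \frac{2}{\sqrt{\pi}} \int_{-\infty}^{-x} \exp(-z^2)\,dz. \tag{10}$$
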

The first term in equation (9) represents the nominal system error
probability due to additive gaussian noise alone while the summation
represents the degradation of the system performance due to
intersymbol interference in the additive gaussian noise environment.
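
A series of this kind can be checked numerically against direct averaging for a short pulse train. The sketch below is an illustration under stated assumptions rather than the authors' program: it takes the series as a Taylor expansion of the conditional error probability in X, uses the physicists' Hermite polynomials (the convention with $H_1(x) = 2x$), and obtains the moments by brute-force enumeration. All function names are hypothetical.

```python
import itertools
import math


def hermite(n, x):
    """Physicists' Hermite polynomial H_n(x), via H_{n+1} = 2x H_n - 2n H_{n-1}."""
    h_prev, h = 1.0, 2.0 * x
    if n == 0:
        return h_prev
    for m in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * m * h_prev
    return h


def isi_values(isi):
    """All 2^N equally likely values of X = sum(a_l * r_l), a_l = +/-1."""
    return [sum(a * r for a, r in zip(s, isi))
            for s in itertools.product((-1.0, 1.0), repeat=len(isi))]


def even_moment(isi, k):
    """2k-th moment of X by direct enumeration (for checking only)."""
    vals = isi_values(isi)
    return sum(v ** (2 * k) for v in vals) / len(vals)


def series_pe(r0, isi, sigma, terms):
    """Nominal erfc term plus the first `terms` moment terms of the series."""
    x = r0 / (math.sqrt(2.0) * sigma)
    pe = 0.5 * math.erfc(x)
    pref = math.exp(-x * x) / math.sqrt(math.pi)
    for k in range(1, terms + 1):
        pe += (pref * even_moment(isi, k) / math.factorial(2 * k)
               * (1.0 / (math.sqrt(2.0) * sigma)) ** (2 * k)
               * hermite(2 * k - 1, x))
    return pe


def exact_pe(r0, isi, sigma):
    """Direct average of the conditional error probability over all values of X."""
    vals = isi_values(isi)
    return sum(0.5 * math.erfc((r0 - v) / (math.sqrt(2.0) * sigma))
               for v in vals) / len(vals)
```

With a small amount of interference relative to the noise, a handful of terms of `series_pe` already agrees closely with `exact_pe`; with zero terms it reduces to the nominal error probability.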


3.1 Convergence Property 

In Appendix A we have shown that equation (9) is an absolutely
convergent series. Therefore, the error probability can be evaluated
by taking a finite number of terms,

$$P_e = \sum_{k=0}^{K-1} P_{e,2k} + R_{2K}, \tag{11}$$

where $R_{2K}$ represents the truncation error and is upper bounded by
a (2K — 3)! 1 
Rae = 2a Poss = "ORK! V4K — 2- IG = 2k 7 


[esp — (GD) [ome rte — ey ir 
. P (2 ine =e) y 
ja | o ] 
= Uox . (12)* 


Thus for a given truncation error bound, $\epsilon$, we may always find a
positive integer, K, such that

$$U_{2K} \le \epsilon. \tag{13}$$

For a real system, the truncation error is generally much smaller
than $\epsilon$. Therefore, fewer terms are needed in evaluating the error
probability.


3.2 Evaluation of Moments


The series expansion of equation (9) can be readily evaluated if we
can determine the moments $M_{2k}$. The $M_{2k}$'s are given by

$$M_{2k} = \int_{\text{all } X} X^{2k}\, dF(X). \tag{14}$$

To evaluate $M_{2k}$ according to equation (14) requires the knowledge of
dF(X); this is just as difficult to obtain as the evaluation of the error
probability given by equation (8). However, we have found it possible
to obtain a recurrence relation for $M_{2k}$ by examining the first
derivative of the characteristic function. The recurrence formula makes the
series expansion approach feasible, and is derived in Appendix B:

$$M_{2k} = -\left\{\sum_{i=1}^{k} \binom{2k-1}{2i-1} (-1)^i M_{2(k-i)}\, f^{(2i-1)}(0)\right\}, \tag{15}$$

where

$$M_0 = 1 \tag{16}$$

$$f^{(2i-1)}(0) = \frac{2^{2i}(2^{2i} - 1)}{2i}\, |B_{2i}| \sum_{\ell \ne 0} [r(t_0 - \ell T)]^{2i} \tag{17}$$

and the $B_{2i}$'s are Bernoulli numbers.


* $(2K - 3)!! = (2K - 3)\cdot(2K - 5) \cdots 3 \cdot 1$
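
The recurrence can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the authors' code: the odd derivatives of f at the origin are taken from the standard tangent power series, i.e. $f^{(2i-1)}(0) = \frac{2^{2i}(2^{2i}-1)}{2i}|B_{2i}|\sum_\ell r_\ell^{2i}$, the Bernoulli numbers are generated exactly with rational arithmetic, and the function names are hypothetical. A brute-force moment is included only as a check.

```python
from fractions import Fraction
import itertools
import math


def bernoulli_numbers(n_max):
    """Exact Bernoulli numbers B_0..B_n_max (convention B_1 = -1/2)."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        s = sum(math.comb(m + 1, j) * B[j] for j in range(m))
        B.append(-s / (m + 1))
    return B


def moments_by_recurrence(isi, k_max):
    """Return [M_0, M_2, ..., M_{2*k_max}] for X = sum(a_l * r_l), a_l = +/-1."""
    B = bernoulli_numbers(2 * k_max)
    f = {}
    for i in range(1, k_max + 1):
        # (2i-1)-th derivative of tan at 0, times the power sum of the samples
        coeff = 2 ** (2 * i) * (2 ** (2 * i) - 1) * abs(B[2 * i]) / (2 * i)
        f[i] = float(coeff) * sum(r ** (2 * i) for r in isi)
    M = [1.0]
    for k in range(1, k_max + 1):
        acc = 0.0
        for i in range(1, k + 1):
            acc += math.comb(2 * k - 1, 2 * i - 1) * (-1) ** i * M[k - i] * f[i]
        M.append(-acc)
    return M


def brute_force_moment(isi, k):
    """2k-th moment by enumerating all 2^N sign patterns (check only)."""
    vals = [sum(a * r for a, r in zip(s, isi))
            for s in itertools.product((-1.0, 1.0), repeat=len(isi))]
    return sum(v ** (2 * k) for v in vals) / len(vals)
```

Each $f^{(2i-1)}(0)$ needs one pass over the N samples, so the work grows linearly with N for a fixed number of moments, whereas `brute_force_moment` grows as $2^N$.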


3.3 Truncated Pulse-Train Approximation 


For any real binary system, the message must be time-limited to a
finite number of symbol durations, or we may even assume that r(t) is
time-limited to, say, N symbol durations. Thus the error probability
may be calculated by evaluating the conditional error probability for
each of $2^N$ possible data sequences and then averaging over all $2^N$
sequences. Since the number of possible data sequences grows
exponentially with N, it would be impractical to evaluate the error probability
by this straightforward method even with a digital computer. Hence, N
must be confined to a small number; the error probability so obtained
could at best be a poor approximation of the true error probability.
However, in equation (9), the amount of computation involved grows
only linearly with N. Therefore, the pulse train can be truncated at
any desired point to assure a good approximation of the true channel.


IV. APPLICATIONS 


The error probabilities for certain cases are calculated by equation 
(9) to determine the accuracy and the convergence of this new method. 


4.1 Case 1: Data Set 203’


A 2400-baud DDD option of the Data Set 203 operating over a
channel having symmetrical parabolic delay distortion, as shown in
Fig. 2, is considered in this case. The group delays at the carrier and
the lower 3-dB frequencies are 0.6 ms relative to the center of the
signal spectrum. The channel we considered is worse than a
worst-case C2 line. A 5-tap mean-square equalizer is used by the receiver
to equalize the channel. A truncated 34-pulse-train approximation (19
samples after and 15 samples before the sampling instant $t_0$) for the
equalized output impulse response was used. The equalized binary
eye is about 70 percent open in this case. The input signal-to-noise
ratio is 14 dB. The error probabilities at the equalizer output evaluated
by equation (9) and the Chernoff inequality are shown in Fig. 3.
Curve (a) is the Chernoff bound. Curve (b) is the error probability
evaluated by taking a finite number of terms in equation (9). Curve
(c) is the truncation error bound given by equation (12). It can be
seen that taking the first nine terms in equation (9) assures less than
one percent truncation error in evaluating the error probability. In
this case, however, the actual series converges after only four terms. An
improvement in accuracy by a factor of 15 is realized by this series
expansion method compared to that obtained by the Chernoff inequality.

Fig. 2—Channel group-delay-frequency response.


4.2 Case 2: Ideal Channel and Ideal Band-Limited Pulse
The received pulse is assumed to have the form,

$$r(t) = \frac{\sin \pi t/T}{\pi t/T}.$$

The signal-to-noise ratio at the nominal sampling instant is taken
to be 16 dB. In the absence of intersymbol interference, the system
error probability is $10^{-10}$. For a truncated 11-pulse-train
approximation, the exact error probabilities and the error probabilities evaluated
by taking a finite number of terms in equation (9), for different values
of sampling instant and number of terms, together with the bound of
equation (12), are shown in Figs. 4-5. It can be seen from these figures
that the series converges more rapidly for smaller values of the quantity

$$q(t_0) = \left(\sum_{\ell \ne 0} |r(t_0 - \ell T)| \Big/ \sigma\right)^2 \tag{18}$$

[e.g., in this case, q(0.05T) = 1.96].
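
The quantity in equation (18) is easy to evaluate for the ideal band-limited pulse. The sketch below is an illustration under stated assumptions, not the authors' program: the 11-pulse truncation is taken to mean five interfering samples on each side of the main sample, and σ is set from the 16-dB figure as σ = r(0)/10^(16/20) with r(0) = 1. The function names are hypothetical. Under these assumptions q comes out near 2 at a timing offset of 0.05T and near 30 at 0.2T, the same order as the values quoted in the text; the exact figures depend on how the signal-to-noise ratio is normalized.

```python
import math


def sinc_pulse(t_over_T):
    """Ideal band-limited pulse r(t) = sin(pi t/T) / (pi t/T)."""
    if t_over_T == 0.0:
        return 1.0
    return math.sin(math.pi * t_over_T) / (math.pi * t_over_T)


def q_value(t0_over_T, sigma, half_width=5):
    """(sum over l != 0 of |r(t0 - l T)| / sigma)^2 for a truncated pulse train.

    half_width=5 gives 10 interfering samples plus the main one (assumed
    meaning of the 11-pulse truncation).
    """
    total = sum(abs(sinc_pulse(t0_over_T - l))
                for l in range(-half_width, half_width + 1) if l != 0)
    return (total / sigma) ** 2


# Assumed normalization: 16 dB = 20 log10(r(0) / sigma) with r(0) = 1.
SIGMA_16_DB = 10.0 ** (-16.0 / 20.0)
```

For example, `q_value(0.05, SIGMA_16_DB)` and `q_value(0.2, SIGMA_16_DB)` give the two regimes discussed in this case: a small value where the series converges quickly and a large value where it oscillates.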
Fig. 3—Comparison of error probabilities obtained by Chernoff bound and series
expansion method. (S/N) input = 14 dB; Data Set 203 (2400-baud option); 5-tap
mean-square equalizer; parabolic delay distortion channel (see Fig. 2).


The series starts to oscillate when $q(t_0)$ is not small [e.g., q(0.2T)
= 30.8]. At $t_0$ = 0.2T, the series did not converge well for the first
eight terms in equation (9). However, it will converge to the exact
value eventually. The error probabilities obtained by Chernoff 
bound,® exact calculation, and equation (9) are shown in Fig. 6. It is 
clear that this new alternative provides a significant improvement over 
the Chernoff bound. 


4.3 Case 3: Ideal Channel and Fourth-Order Chebyshev Pulse°


In this case, a fourth-order Chebyshev filter is used. The received
pulse is

$$r(t) = A_1 \cos(\omega_1 |t|/T - \Phi_1)\cdot\exp[-\alpha_1 |t|/T] + A_2 \cos(\omega_2 |t|/T - \Phi_2)\cdot\exp[-\alpha_2 |t|/T], \tag{19}$$

with

$A_1 = 0.4023$, $A_2 = 0.7163$,
$\omega_1 = 2.839$, $\omega_2 = 1.176$,
$\Phi_1 = 0.7558$, $\Phi_2 = 0.1602$,
$\alpha_1 = 0.4587$, $\alpha_2 = 1.107$.


The signal-to-noise ratio at the nominal sampling instant is taken to be 
16 dB. For a truncated 11-pulse-train approximation, the exact error 
probabilities and the error probabilities obtained by taking a finite 
number of terms in equation (9) for various sampling instants and 
numbers of terms are shown in Figs. 7-8. The error probabilities ob- 
tained by the Chernoff* bound, the exact calculation, and equation (9) 
are shown in Fig. 9. The same results as in case 2 are observed. 


V. SUMMARY AND CONCLUSIONS 


Fig. 4—Error probabilities versus number of terms, k, in equation (9). Ideal
band-limited signal. 11-pulse truncation approximation; sampling instant,
t = 0.05T; (S/N) = 16 dB.

Fig. 5—Error probabilities versus number of terms, k, in equation (9). Ideal
band-limited signal. 11-pulse truncation approximation; sampling instant,
t = 0.2T; (S/N) = 16 dB.

In this study we have developed a new method of evaluating the
error probability for synchronous data systems in the presence of
intersymbol interference and additive gaussian noise under the
following assumptions. First, the information digits are identically and
independently distributed. Second, the intersymbol interference con- 
verges absolutely. (For those pulses with absolutely divergent inter- 
symbol interference, only finite truncated approximation of the real 
pulse will be used.) Three cases, which are representative of practical 
situations, are considered. The results show that this new method has 
a significant improvement in accuracy over Chernoff bound. For exam- 
ple, we consider the 2400-baud DDD option of the Data Set 203 
operating over a channel having symmetrical delay distortion in excess 
of that of a worst-case C-2 line. A 5-tap mean-square equalizer is used 
by the receiver to equalize the channel. With a 14-dB input signal-to- 
noise ratio, the series expansion method provides a factor of 15 im- 
provement over the Chernoff bound in estimating the error probability 
at the equalizer output. 

Fig. 6—Comparison of error probabilities obtained by Chernoff bound,
exhaustive method, and series expansion method, plotted against sampling time
deviation from nominal value. Ideal band-limited signal; (S/N) = 16 dB.
[- - - Chernoff bound, ——— exhaustive method (11-pulse truncation),
○○○ series expansion (8 terms).]

Fig. 7—Error probability versus number of terms, k, in equation (9).
Fourth-order Chebyshev pulse; 11-pulse truncation approximation; sampling
instant, t = 0.05T; (S/N) = 16 dB.

Fig. 8—Error probabilities versus number of terms, k, in equation (9).
Fourth-order Chebyshev pulse; 11-pulse truncation approximation; sampling
instant, t = 0.2T; (S/N) = 16 dB.

The absolute convergence of the series expansion method is proved
in Appendix A. An estimate of the terms required to reach the
neighborhood of the true error probability is provided by equation (12). In
actual systems, however, the true value is usually reached with only 
a small number of expansion terms. For example, in Fig. 3, the trunca- 
tion error is less than 2 X 10-® after taking into account the 9th term 
of the series expansion (which involves the 18th moment of the inter- 
symbol interference); practically speaking, however, only three or 
four terms would be required for the series to converge in this example. 
In all the examples we considered, it is observed that a small error 
is assured by taking into account the first ten terms of the series. 

The convergence is somewhat slower if the ratio of intersymbol
interference to noise power [$q(t_0)$] is large (see Section IV, case 2), as
indicated in Figs. 5 and 8. Under this condition, either the intersymbol
interference is so bad that the system is not of practical interest, or the
input signal-to-noise ratio is so high that the Chernoff bound already
assures that the system performance is acceptable. In both cases,
there is no need to evaluate the error probability.

For computation purposes every system must be approximated by a
finite-memory system. Since the computations involved in this new
method increase only linearly with the length of the memory, a good
approximation of the true channel may be obtained without excessive
computation.


APPENDIX A 


Convergence of the Series Expansion Method 
In this Appendix, we shall prove that equation (9) is an absolutely
convergent series. We know that

$$M_{2k} = \int_{\text{all } X} X^{2k}\, dF(X) \le \int_{\text{all } X} (\max X)^{2k}\, dF(X) = \left\{\sum_{\ell \ne 0} |r(t_0 - \ell T)|\right\}^{2k} \tag{20}$$

Fig. 9—Comparison of error probabilities obtained by Chernoff bound,
exhaustive method, and series expansion method, plotted against sampling time
deviation from nominal value. Fourth-order Chebyshev pulse; (S/N) = 16 dB.
[- - - Chernoff bound, ——— exhaustive method (11-pulse truncation),
○○○ series expansion (8 terms).]

and
Hoxss(t) = (—1)*2" (2K — INV 2K +1 
-exp (27/2) - | sin (V4k +32) + (+. Wk )]. (21)* 





Hence, 
Riek l= eae /2K —1 Fr (3) + 
“exp | -(G@) | {27 | rte — 42) 13," 
= Sox . (22) 


The ratio of Sox 42 to Sox is given by 
2 
Sex (2K +2)V2K +1 ; 


For K sufficiently large, equation (23) is always less than unity. There- 
fore equation (9) is an absolutely convergent series. 


(23) 





APPENDIX B 


Derivation of the Recurrence Relations for the Moment of Intersymbol 
Interference 


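It has been shown that the intersymbol interference converges
absolutely to a random variable’ X. The characteristic function of the
random variable X is given by

$$\Phi(\omega) = \int_{\text{all } X} e^{j\omega X}\, dF(X) = 1 + j\omega M_1 + \frac{(j\omega)^2}{2!} M_2 + \cdots + \frac{(j\omega)^k}{k!} M_k + \cdots. \tag{24}$$

Therefore, we obtain

$$\Phi(0) = 1, \qquad \Phi^{(2)}(0) = \left.\frac{d^2\Phi(\omega)}{d\omega^2}\right|_{\omega = 0} = -M_2, \qquad \cdots, \qquad \Phi^{(2k)}(0) = \left.\frac{d^{2k}\Phi(\omega)}{d\omega^{2k}}\right|_{\omega = 0} = (-1)^k M_{2k}. \tag{25}$$

Since the $a_\ell$'s are identically and independently distributed random
variables with zero mean,

$$M_1 = M_3 = \cdots = M_{2k+1} = \cdots = 0 \quad \text{for } k = 0, 1, 2, \cdots \tag{26}$$


* See Ref. 8.
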


and

$$\Phi(\omega) = \prod_{\ell=1}^{N} \cos \omega r(t_0 - \ell T), \tag{27}$$

where a truncated N-pulse-train approximation of the channel impulse
response is assumed.

The even-order moments could be obtained by differentiating
equation (27) 2k times, but the right-hand-side expressions could become
intractable. However, if we differentiate equation (27) once and
regroup the terms, we obtain the following,

$$\Phi'(\omega) = -\left\{\sum_{\ell=1}^{N} r(t_0 - \ell T) \tan \omega r(t_0 - \ell T)\right\} \cdot \Phi(\omega) = -f(\omega) \cdot \Phi(\omega). \tag{28}$$


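By successive differentiation of equation (28), a recurrence relation
can now be obtained. Differentiating equation (28) 2k − 1 times, we
obtain

$$\Phi^{(2k)}(0) = -\left\{\sum_{i=1}^{k} \binom{2k-1}{2i-1} \Phi^{(2(k-i))}(0)\, f^{(2i-1)}(0)\right\}, \tag{29}$$

where

$$f^{(2i-1)}(0) = \left.\frac{d^{2i-1}}{d\omega^{2i-1}} f(\omega)\right|_{\omega = 0}. \tag{30}$$

The power series expansion of $\tan \omega r(t_0 - \ell T)$ around the origin is

$$\tan \omega r(t_0 - \ell T) = \omega r(t_0 - \ell T) + \frac{(\omega r(t_0 - \ell T))^3}{3} + \cdots + \frac{2^{2k}(2^{2k} - 1)}{(2k)!}\, |B_{2k}|\, (\omega r(t_0 - \ell T))^{2k-1} + \cdots, \tag{31}$$

where $B_{2k}$ is the Bernoulli number. It can be seen that

$$\left.\frac{d^k}{d\omega^k} \tan \omega r(t_0 - \ell T)\right|_{\omega = 0} = [r(t_0 - \ell T)]^k\, \frac{2^{k+1}(2^{k+1} - 1)}{k + 1}\, |B_{k+1}|, \quad \text{for } k = \text{odd positive integers}, \tag{32a}$$

$$= 0, \quad \text{for } k = \text{even positive integers}. \tag{32b}$$

Thus,

$$f^{(k)}(0) = \frac{2^{k+1}(2^{k+1} - 1)}{k + 1}\, |B_{k+1}|\, \mu_k, \quad \text{for } k = \text{odd positive integers}, \tag{33a}$$

$$= 0, \quad \text{for } k = \text{even positive integers}, \tag{33b}$$

where

$$\mu_k = \sum_{\ell=1}^{N} [r(t_0 - \ell T)]^{k+1}. \tag{33c}$$

Since

$$M_{2k} = (-1)^k \Phi^{(2k)}(0), \tag{34}$$

combining equations (34) and (29), we obtain the recurrence relation
for $M_{2k}$,

$$M_{2k} = -\left\{\sum_{i=1}^{k} \binom{2k-1}{2i-1} (-1)^i M_{2(k-i)}\, f^{(2i-1)}(0)\right\}, \tag{35}$$

where the $f^{(2i-1)}(0)$'s are given by equation (33a).
Knowing that $M_0 = 1$, all the higher order moments can be obtained
via equation (35) without the knowledge of dF(X).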
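
As a quick hand check of the recurrence, the first two even moments can be worked out directly. The shorthand $r_\ell = r(t_0 - \ell T)$ is introduced here only for brevity, and the values $|B_2| = 1/6$ and $|B_4| = 1/30$ are the standard Bernoulli numbers, which give $f'(0) = \sum_\ell r_\ell^2$ and $f'''(0) = 2\sum_\ell r_\ell^4$:

$$M_2 = \binom{1}{1} M_0\, f'(0) = \sum_{\ell} r_\ell^2,$$

$$M_4 = \binom{3}{1} M_2\, f'(0) - \binom{3}{3} M_0\, f'''(0) = 3\Big(\sum_{\ell} r_\ell^2\Big)^2 - 2\sum_{\ell} r_\ell^4.$$

Both agree with a direct expansion of $E\big[(\sum_\ell a_\ell r_\ell)^2\big]$ and $E\big[(\sum_\ell a_\ell r_\ell)^4\big]$ for independent $a_\ell = \pm 1$, since all cross terms containing an odd power of any $a_\ell$ average to zero.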
Upper Bound on the Efficiency of


dc-Constrained Codes


By TA-MU CHIEN 
(Manuscript received June 2, 1970) 


We derive the limiting efficiencies of dc-constrained codes. Given bounds
on the running digital sum (RDS), the best possible coding efficiency $\eta$,
for a K-ary transmission alphabet, is $\eta = \log_2 \lambda_{\max}/\log_2 K$, where $\lambda_{\max}$
is the largest eigenvalue of a matrix which represents the transitions of
the allowable states of RDS. Numerical results are presented for the three
special cases of binary, ternary and quaternary alphabets.


I. INTRODUCTION 


In digital transmission systems, the transmission channel often does
not pass dc. This causes the well-known problem of baseline wander.
One way to overcome this difficulty is to restrict the dc content in the
signal stream using suitably devised codes.+? As a result many codes
having a dc-constrained property have been studied.+” The coding
requirement is represented by the constraint put upon the running digital
sum (RDS) of the coded signal stream. We expect that the efficiency
of a dc-constrained code is related to the limits of RDS in some
definite way. This is the subject to which we address ourselves in this
paper. More specifically, we intend to answer the question: What is
the best possible efficiency of any dc-constrained code satisfying a
given limit on RDS?

Let $\{a_1, a_2, \cdots\}$ be the sequence of the transmitted symbols; the
RDS of the signal stream at instant k is defined to be the sum
$\sum_{i=1}^{k} a_i$. Taking the RDS at any instant as the state of the signal stream
at that point, the limits on RDS define a set of allowable states, and
each additional signal symbol may be considered as a transition from
one state to another. This transition can be represented by a matrix,
called naturally the transition matrix. For a K-ary signal alphabet,
the best possible efficiency $\eta$ of dc-constrained codes is found to be

$$\eta = \frac{\log_2 \lambda_{\max}}{\log_2 K} \tag{1}$$

where $\lambda_{\max}$ is the largest eigenvalue of the transition matrix.
The efficiency of a code is defined to be the ratio of the average bits
per symbol of the coded signal stream to that of the random (uncoded)
signal stream.

McCullough! has derived the same result (1) for the special cases of 
K = 2 and 3. His approach is quite different from what will be
presented in the sequel.

We first describe in detail the construction of a mathematical model 
for the case of a binary alphabet. Then we generalize the result of the 
binary case to include any alphabet set. Methods of effecting numeri- 
cal calculation are discussed as well as approximation formulas. The 
numerical results for three important cases are presented and known 
codes are compared with the theoretical limits. 


II. LIMITING EFFICIENCY OF THE BINARY CODES


In this section we confine our discussion to binary signals and direct
our attention to the intuitive reasoning which leads to the construction
of a simple mathematical model and its interpretation.

Let M, a positive integer, be the desired bound on the RDS of the
coded binary signal stream. This defines a subset $S_M(\infty)$ of the set
$S(\infty)$ of all infinite binary sequences in the following way: An infinite
sequence is in the subset $S_M(\infty)$ if the RDS of the sequence is nowhere
larger than M or less than −M, i.e., $|\sum_{i=1}^{k} a_i| \le M$ for $k = 1, 2, \cdots$.
A sequence in $S_M(\infty)$ is called an allowable sequence. Denoting by
$N_M(\infty)$ and $N(\infty)$ the number of infinite sequences in $S_M(\infty)$ and $S(\infty)$
respectively, the average information per symbol for the sequences
in $S_M(\infty)$ is given by

$$\eta = \frac{\log_2 N_M(\infty)}{\log_2 N(\infty)}, \tag{2}$$

assuming the ratio exists. If we interpret the set $S(\infty)$ as source data
and $S_M(\infty)$ as the transmitted signal, then $\eta$ defined in equation (2)
is the efficiency of a dc-constrained code which maps one-to-one from
$S(\infty)$ onto $S_M(\infty)$.*

Clearly for any code which satisfies the requirement that RDS be
bounded by M, the coded signal stream must be a member of $S_M(\infty)$.
Therefore, the set of allowable infinite sequences defined by any code
satisfying the desired constraint on RDS must be a subset of $S_M(\infty)$.
Thus we conclude that the formal expression in (2) indeed gives the
best possible efficiency for a given bound M. Our next step is to find
a way to count the number of allowable sequences in $S_M(\infty)$.


* The puzzle of mapping a large set to a small set can be cleared mathematically
by observing that the cardinality of both $S(\infty)$ and $S_M(\infty)$ are that of a
continuum, and physically by demanding that the transmitter has a higher baud
than the source.

Let us start by counting the allowable sequences of finite length L.
Define an occupancy vector of the allowable states $u_L$, ′ denoting the
transpose of a vector (or matrix),

$$u_L = [u_{-M}, \cdots, u_0, \cdots, u_M]', \tag{3}$$

where $u_k$, $k = -M, \cdots, M$, is the number of allowable sequences
of length L with their RDS at the end equal to k, i.e., $\sum_{i=1}^{L} a_i = k$. The
total number of allowable sequences of length L, $N_M(L)$, is simply

$$N_M(L) = \sum_{k=-M}^{M} u_k. \tag{4}$$

As $L \to \infty$, $N_M(L) \to N_M(\infty)$ and the total number of sequences
of length L, $N(L) = 2^L \to N(\infty)$. Hence we can rewrite (2) as

$$\eta = \lim_{L \to \infty} \frac{\log_2 N_M(L)}{L}. \tag{5}$$

Now our job is to find a formula for the number of allowable sequences
of finite length.

Suppose we know the occupancy vector $u_L$ and we want to calculate
the occupancy vector $u_{L+1}$. Clearly for any allowable sequence of
length L + 1, its first L elements must be one of the allowable sequences
of length L. We generate, therefore, the allowable sequences of length
L + 1 from that of length L by adding one more binary symbol (+1
or −1). Therefore, the sequences of length L + 1 in the −Mth state
are generated by adding −1 to the sequences of length L in the −M +
1st state; the sequences of length L + 1 in the −M + 1st state are
generated by adding +1 to the sequences of length L in the −Mth
state and by adding −1 to the sequences of length L in the −M + 2nd
state; etc. It is not difficult to see that the new state occupancy vector is

$$u_{L+1} = \begin{bmatrix} u_{-M+1} \\ u_{-M} + u_{-M+2} \\ \vdots \\ u_{k-1} + u_{k+1} \\ \vdots \\ u_{M-2} + u_M \\ u_{M-1} \end{bmatrix}. \tag{6}$$
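
The update described above can be sketched directly. This is an illustrative sketch rather than code from the paper; the function names are hypothetical, list index 0 stands for state −M, and the starting vector is the zero-length occupancy vector that the text introduces in equation (10). A brute-force enumeration is included only as a check for small L.

```python
import itertools


def occupancy(M, L):
    """Occupancy vector u_L for RDS bound M; index 0 is state -M, index 2M is +M."""
    n = 2 * M + 1
    u = [0] * n
    u[M] = 1                        # the zero-length sequence sits in state 0
    for _ in range(L):
        nxt = [0] * n
        for k in range(n):
            if k > 0:
                nxt[k] += u[k - 1]  # reached by appending +1
            if k < n - 1:
                nxt[k] += u[k + 1]  # reached by appending -1
        u = nxt
    return u


def brute_force_count(M, L):
    """Count length-L sequences of +/-1 whose RDS never leaves [-M, M]."""
    count = 0
    for seq in itertools.product((-1, 1), repeat=L):
        rds, ok = 0, True
        for a in seq:
            rds += a
            if abs(rds) > M:
                ok = False
                break
        count += ok
    return count
```

Summing the entries of `occupancy(M, L)` gives the number of allowable sequences of length L; the recursion costs on the order of L(2M + 1) additions, while the enumeration costs $2^L$.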
Equivalently, $u_{L+1}$ can be written as

$$u_{L+1} = A_{2M+1}\, u_L \tag{7}$$

where

$$A_{2M+1} = \begin{bmatrix} 0 & 1 & & & \\ 1 & 0 & 1 & & \\ & 1 & \ddots & \ddots & \\ & & \ddots & 0 & 1 \\ & & & 1 & 0 \end{bmatrix} \tag{8}$$

is a square matrix of size 2M + 1 with ones in the superdiagonal and
the subdiagonal and zeros elsewhere. $A_{2M+1}$ is the transition matrix of
the allowable states. By the same reasoning, we have,

$$u_L = A_{2M+1}\, u_{L-1}, \qquad u_{L-1} = A_{2M+1}\, u_{L-2}, \qquad \cdots, \qquad u_1 = A_{2M+1}\, u_0, \tag{9}$$

where $u_0$ is the occupancy vector of the sequence of zero length. It is
defined naturally with one at the zeroth state and zeros elsewhere,

$$u_0 = [0, \cdots, 0, 1, 0, \cdots, 0]'. \tag{10}$$

From equation (9), we obtain, by successive substitution,

$$u_L = A_{2M+1}^{L}\, u_0. \tag{11}$$
The total number of allowable sequences, from equation (4), is

$$N_M(L) = \mathbf{1}' A_{2M+1}^{L}\, u_0 \tag{12}$$

where

$$\mathbf{1} = [1, 1, \cdots, 1]'. \tag{13}$$

The class of matrices $A_n$ defined in equation (8) has many
interesting properties. Their investigation is relegated to the Appendix.

Using the result derived in the Appendix, and adopting the following
ordering of the eigenvalues of $A_{2M+1}$:

$$\lambda_{-M} < \lambda_{-M+1} < \cdots < \lambda_{M-1} < \lambda_M,$$

we can rewrite equation (12),


Nu(L) = PDP 'uy (14) 
where 
hw 0 ] 
Ds = c (15) 
0 Md 
bo(d—ar)/P(A~ar) ++ > Go(Aar)/P(ar) 
Pins 10-11) /6(0—a0) 7s 10.10) /Q a0) (16) 


oa(A- a1) /b(A— a0) doar(ar)/b(Aaz) 
from Lemma 7 of the Appendix and (A) and ¢;(A) are defined in 
equation (54) and (70). By straightforward multiplication, we can 
write, from equation (14), 


NAL) = Dee Nib arer(da) 2D, (1), (17) 


i= 


where the normalization constants $\phi(\lambda_i)$ are omitted for simplicity.
Denote by $\lambda_{\max}$ the largest absolute value of the eigenvalues of A, and
from Lemmas 8 and 6 of the Appendix we know that

$$\lambda_{\max} = \lambda_M = -\lambda_{-M}. \tag{18}$$

Then from equation (17) and Lemma 2,


2M 2M 
N y(L) = Ni arash dX ;(Na1) 2s (=1) *bsrsi xin) pS 60a} 
M-1 2M 
+ DT MibarQ) 27 dis) 


2af 


MeeAy pS (Lb (Hy gba) 


I 


M-1 2M 
-|- pz: Meus 2s :00i)- (19) 


i=—M41 


Since $\phi_j(\lambda_M) > 0$ for all j (see the proof of Lemma 4), the coefficient
of the $\lambda_M^L$ term in equation (19),


bar+i(Aar) Ze [P+ (-)™"6,0An) > 0 (20) 


independent of L. 
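Substituting equation (19) in (5) and using (18), we have

$$
\begin{aligned}
\eta &= \lim_{L\to\infty}\frac{1}{L}\log_2\left\{\lambda_{\max}^{L}\left[z_{\max} + \sum_{i=-M+1}^{M-1} z_i\left(\frac{\lambda_i}{\lambda_{\max}}\right)^{L}\right]\right\} \\
&= \log_2\lambda_{\max} + \lim_{L\to\infty}\frac{1}{L}\log_2\left[z_{\max} + \sum_{i=-M+1}^{M-1} z_i\left(\frac{\lambda_i}{\lambda_{\max}}\right)^{L}\right], \qquad (21)
\end{aligned}
$$

where $z_{\max}$ and $z_i$ are the coefficients of $\lambda_M^{L}$ and $\lambda_i^{L}$, $i \neq \pm M$, in equation
(19). The second term in equation (21) approaches zero as a limit
since $z_{\max} > 0$. Thus we have the desired result for the binary case,

$$\eta = \log_2 \lambda_{\max}. \qquad (22)$$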


Actually, we have proved a result more general than (22). Observe
that, in passing to the limit, the crucial point is that $z_{\max}$ in (21) be
nonzero. From equation (20) and the fact that $\phi_j(\lambda_M) \neq 0$ for $j \le 2M$,
we conclude that the particular $u_0$ we use, though natural, is immaterial,
and any vector with non-negative coordinates will serve the purpose.
Observe also that the actual values of the allowable RDS states nowhere
enter into our discussion; hence, it is immaterial whether the
bound on the RDS is symmetric or not. We can consolidate our discussion
by stating the following theorem.


Theorem 1: For a binary alphabet, if the RDS of a coded signal
stream is required to be within some bounds $M^+$ and $M^-$, where $M^+$ and
$M^-$ are integers, then the best possible coding efficiency is given by

$$\eta = \log_2 \lambda_{\max},$$

where $\lambda_{\max}$ is the largest positive eigenvalue of the transition matrix $A_n$
of size $n = M^+ - M^- + 1$ as defined in equation (8).
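Theorem 1 can be checked numerically on small cases. The following is a minimal sketch, not the authors' program: it counts allowable binary sequences by propagating the occupancy vector (the analogue of $N_M(L) = \mathbf{1}' A^{L} u_0$), cross-checks against brute-force enumeration, and estimates $\lambda_{\max}$ from the growth ratio $N(L+2)/N(L)$. The function names, the state indexing 0..n-1, and the choice L = 400 are assumptions made for illustration.

```python
import math
from itertools import product

def count_bounded(L, n, start):
    """Number of +/-1 sequences of length L whose running digital sum,
    started in state index `start` (0..n-1), stays inside the n states."""
    u = [0] * n
    u[start] = 1                      # occupancy vector of the empty sequence
    for _ in range(L):
        nxt = [0] * n
        for s, c in enumerate(u):
            if c:
                if s + 1 < n:
                    nxt[s + 1] += c   # symbol +1
                if s - 1 >= 0:
                    nxt[s - 1] += c   # symbol -1
        u = nxt
    return sum(u)                     # sum of occupancies = number of sequences

def brute_force(L, n, start):
    """Same count by explicit enumeration (only feasible for small L)."""
    total = 0
    for seq in product((+1, -1), repeat=L):
        s, ok = start, True
        for a in seq:
            s += a
            if not 0 <= s < n:
                ok = False
                break
        total += ok
    return total

def efficiency(n, start, L=400):
    """Estimate log2(lambda_max) from N(L+2)/N(L), which tends to lambda_max**2."""
    ratio = count_bounded(L + 2, n, start) / count_bounded(L, n, start)
    return 0.5 * math.log2(ratio)

print(efficiency(3, 1), efficiency(5, 2))
```

The two-step ratio is used because the binary transition matrix has eigenvalues in plus/minus pairs, so a one-step ratio oscillates with the parity of L.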


III. GENERALIZATION TO K-ary CODES 


We now wish to extend the result derived in the previous section to
an arbitrary K-ary alphabet set, $\{a_1, \cdots, a_K\}$. We shall restrict
ourselves to symmetric alphabets. Namely, if $K$ is even, $a_i$ takes on the
values $-(K-1), -(K-3), \cdots, -1, +1, \cdots, (K-1)$; if $K$
is odd, $a_i$ takes on the values $-(K-1)/2, -(K-3)/2, \cdots, -1, 0,
1, \cdots, (K-1)/2$. The transition matrix of allowable states is then
given by

$$A_n = \sum_{a_i \ge 0} H_n^{a_i} + \sum_{a_i < 0} F_n^{|a_i|}, \qquad (23)$$

where the size of the matrices is

$$n = M^+ - M^- + 1, \qquad (24)$$

and $M^+$ and $M^-$ are the desired upper and lower bounds on the RDS. If
$a_i = 0$ is a member of the alphabet, we follow the usual convention
that $H_n^{0} = I_n$. The matrices


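$$H_n = \begin{bmatrix}
0 & 1 & & & 0 \\
 & 0 & 1 & & \\
 & & \ddots & \ddots & \\
 & & & 0 & 1 \\
0 & & & & 0
\end{bmatrix} \qquad (25)$$

and

$$F_n = \begin{bmatrix}
0 & & & & 0 \\
1 & 0 & & & \\
 & 1 & \ddots & & \\
 & & \ddots & 0 & \\
0 & & & 1 & 0
\end{bmatrix} \qquad (26)$$
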
are known as the superdiagonal and subdiagonal matrices, respectively. To
see that $A_n$ given in equation (23) is indeed the transition matrix, we
observe that each symbol $a_i$ takes a sequence in any allowable
state to a state $a_i$ units away. Each term in (23) represents the
transition of states due to a particular symbol of the alphabet. As an
example, taking the quaternary alphabet set $\{-3, -1, +1, +3\}$, the
transition matrix is

$$A_n = \begin{bmatrix}
0 & 1 & 0 & 1 & & & 0 \\
1 & 0 & 1 & 0 & 1 & & \\
0 & 1 & 0 & 1 & 0 & \ddots & \\
1 & 0 & 1 & 0 & \ddots & \ddots & 1 \\
 & 1 & 0 & \ddots & \ddots & 1 & 0 \\
 & & \ddots & \ddots & 1 & 0 & 1 \\
0 & & & 1 & 0 & 1 & 0
\end{bmatrix}.$$

With these preliminaries out of the way, we now state a general result
on the limiting efficiency of dc-constrained codes:

Theorem 2: If the RDS of the coded K-ary signal stream is required
to be within some bounds $M^+$ and $M^-$, then the best possible coding
efficiency is given by

$$\eta = \frac{\log_2 \lambda_{\max}}{\log_2 K}, \qquad (27)$$

where $\lambda_{\max}$ is the largest eigenvalue of the transition matrix $A_n$ defined
by equations (23) and (24).

Before we embark on the proof of Theorem 2, we need to establish 
an important auxiliary result. 

Let $N_M(L)$ denote again the number of allowable K-ary sequences;
then the limiting efficiency $\eta$, corresponding to equation (5), is

$$\eta = \lim_{L\to\infty}\frac{\log_2 N_M(L)}{L \log_2 K}. \qquad (28)$$

In the set of allowable sequences $S_M(L)$, we can define a subset
$S_{M|a_i}(L)$ by restricting the first symbol to be $a_i$. Similarly we define
a subset $S_{M|a_i,-a_i}(L)$ by restricting the first two symbols to be
$a_i$ and $-a_i$. Clearly

$$S_M(L) \supset S_{M|a_i}(L) \supset S_{M|a_i,-a_i}(L), \qquad (29)$$

and it follows that

$$\eta \ge \eta_{a_i} \ge \eta_{a_i,-a_i}, \qquad (30)$$

where $\eta_{a_i}$ and $\eta_{a_i,-a_i}$ are the limiting efficiencies given by equation
(28) with the additional restriction on the leading elements.

Considering now all the sequences in $S_{M|a_i,-a_i}(L+2)$, it is not
difficult to see that the number of sequences in $S_{M|a_i,-a_i}(L+2)$ is
equal to that in $S_M(L)$. Hence the efficiency

$$\eta_{a_i,-a_i} = \lim_{L\to\infty}\frac{\log_2 N_M(L)}{(L+2)\log_2 K}
= \lim_{L\to\infty}\frac{L}{L+2}\cdot\frac{\log_2 N_M(L)}{L\log_2 K}
= \eta. \qquad (31)$$

Coupled with equation (30), we have shown that

$$\eta = \eta_{a_i} = \eta_{a_i,-a_i}. \qquad (32)$$


A little reflection should convince us that any finite pattern at the
beginning of the sequences does not affect the limiting efficiency $\eta$.
In other words, the limiting efficiency is independent of starting point 
—a fact we observed in the previous section after the detailed study 
of the transition matrix. This fact enables us to prove Theorem 2 
without going through a tedious mathematical analysis. 

Proof of Theorem 2: The matrix $A_n$ defined in equation (23) is
real and symmetric. It can be diagonalized by an orthogonal transformation,
i.e.,

$$A_n = P D_n P', \qquad P'P = I_n, \qquad (33)$$

where $D_n$ is a diagonal matrix of real elements $\lambda_1, \cdots, \lambda_n$.

Using any $u_0$, a constant vector with non-negative elements, we
can generate a sequence of vectors $u_L$,

$$u_L = A_n^{L} u_0, \quad \text{for } L = 1, 2, \cdots. \qquad (34)$$

Since $A_n$ is a matrix with non-negative elements, it is easy to see that
all $u_L$'s are vectors with non-negative elements. Write

$$P = [p_1 \; p_2 \; \cdots \; p_n], \qquad (35)$$

where $p_i$ is a column vector. From equation (34) we have, using $u_0$
with only a 1 in the $j$th position,

$$u_L = P D_n^{L} P' u_0 = \sum_{i=1}^{n} \lambda_i^{L}\, p_{ji}\, p_i, \qquad (36)$$


where $p_{ji}$ is the $j$th element in $p_i$.
Let $\lambda_{\max}$ denote the largest absolute value of the eigenvalues of $A_n$
and assume that, in general,*

$$\lambda_1 = \lambda_2 = \cdots = \lambda_r = \lambda_{\max},$$
$$\lambda_{r+1} = \lambda_{r+2} = \cdots = \lambda_{r+s} = -\lambda_{\max}. \qquad (37)$$

We can rewrite (36) as

$$u_L = \lambda_{\max}^{L}\left\{\sum_{i=1}^{r} p_{ji}p_i + (-1)^{L}\sum_{i=r+1}^{r+s} p_{ji}p_i + \sum_{i=r+s+1}^{n}\left(\frac{\lambda_i}{\lambda_{\max}}\right)^{L} p_{ji}p_i\right\}. \qquad (38)$$

Denote by $z$ the first two sums in equation (38),

$$z = \sum_{i=1}^{r} p_{ji}p_i + (-1)^{L}\sum_{i=r+1}^{r+s} p_{ji}p_i. \qquad (39)$$

$z$ must be non-negative for any $j$ and $L$. If not, then for some large
enough $L$, $u_L$ will have negative elements, which is a contradiction.
Since $z$ is a linear combination of $p_1, \cdots, p_{r+s}$, a set of linearly
independent vectors, $z = 0$ only if $p_{ji} = 0$, $i = 1, \cdots, r+s$. Furthermore,
if $p_{ji} = 0$ for all $j = 1, \cdots, n$, then the transformation matrix $P$ has a
column of zeros, which is again a contradiction. Thus we conclude that,
for some choice of $u_0$, i.e., for some $j$,

$$u_L = \lambda_{\max}^{L}\left\{z + \sum_{i=r+s+1}^{n}\left(\frac{\lambda_i}{\lambda_{\max}}\right)^{L} p_{ji}p_i\right\} \qquad (40)$$

with $z$ non-negative and nonzero regardless of $L$. The total number of allowable
sequences is

$$N_M(L) = \lambda_{\max}^{L}\left\{\mathbf{1}'z + \sum_{i=r+s+1}^{n}\left(\frac{\lambda_i}{\lambda_{\max}}\right)^{L} p_{ji}\,\mathbf{1}'p_i\right\}. \qquad (41)$$

Substituting $N_M(L)$ in equation (28), and passing to the limit, we get
equation (27). The proof is now complete.
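The independence of the limiting efficiency from the starting state can be observed numerically. The sketch below is an illustration under stated assumptions, not the paper's method: it counts allowable K-ary sequences by iterating the occupancy vector and estimates $\lambda_{\max}$ from $N(L+2)/N(L)$; the function names, the state indexing 0..n-1, and L = 300 are hypothetical choices.

```python
import math

def count_sequences(alphabet, n, start, L):
    """Number of length-L sequences over `alphabet` whose running digital
    sum, started in state index `start`, stays within the n allowed states."""
    u = [0] * n
    u[start] = 1
    for _ in range(L):
        nxt = [0] * n
        for s, c in enumerate(u):
            if c:
                for a in alphabet:
                    t = s + a
                    if 0 <= t < n:
                        nxt[t] += c
        u = nxt
    return sum(u)

def limiting_efficiency(alphabet, n, start, L=300):
    """Estimate log2(lambda_max)/log2(K) from the two-step growth ratio."""
    ratio = (count_sequences(alphabet, n, start, L + 2)
             / count_sequences(alphabet, n, start, L))
    return 0.5 * math.log2(ratio) / math.log2(len(alphabet))

if __name__ == "__main__":
    for start in range(3):
        print(start, limiting_efficiency([-1, 0, 1], 3, start))
```

For the ternary alphabet with three states, every starting state gives the same estimate, which agrees with $\log_3(1 + \sqrt{2})$, the value obtained from the largest eigenvalue of the 3-by-3 transition matrix.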


IV. NUMERICAL RESULTS AND DISCUSSIONS 


4.1 Numerical Calculation

Using the digital computer, the calculation of $\lambda_{\max}$ of any transition
matrix $A_n$ is not difficult, except perhaps for large $n$. In the following,
we discuss several alternative approaches to evaluating $\lambda_{\max}$, and
we present results for three important cases.


* It can be shown that $r = 1$ and $s = 0$ or 1. But the proof that follows does
not require this fact.

(i) Find $\lambda_{\max}$ by direct diagonalization of the matrix $A_n$. There
are computer programs developed for this purpose. This is done for
the binary case and the quaternary case, and the limiting efficiency
$\eta$ is plotted (solid curve) as a function of the number of allowable states $n$ in Figs.
1 and 2, respectively.

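(ii) In the binary case, the characteristic polynomial $\phi_n(\lambda)$ of
$A_n$ satisfies a simple recursive relation (56). Treating (56) as a difference
equation in the $\phi_n(\lambda)$'s, one can express $\phi_n(\lambda)$ in an alternate form:*

$$\phi_n(\lambda) = \frac{\sin\left[(n+1)\cos^{-1}(\lambda/2)\right]}{\sin\left[\cos^{-1}(\lambda/2)\right]}. \qquad (42)$$

The roots of $\phi_n(\lambda)$ are, as easily seen in equation (42),

$$\lambda_k = 2\cos\frac{k\pi}{n+1}, \qquad k = 1, 2, \cdots, n. \qquad (43)$$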
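The closed-form roots can be checked against the recursion numerically. The following is a small sketch under the assumption that the polynomials are generated by $\phi_n = \lambda\phi_{n-1} - \phi_{n-2}$ with $\phi_0 = 1$, $\phi_1 = \lambda$, as stated in the Appendix; the function names are illustrative.

```python
import math

def phi(n, lam):
    """Evaluate phi_n(lam) by the three-term recursion."""
    p_prev, p = 1.0, lam              # phi_0, phi_1
    if n == 0:
        return p_prev
    for _ in range(2, n + 1):
        p_prev, p = p, lam * p - p_prev
    return p

def roots(n):
    """Closed-form roots 2*cos(k*pi/(n+1)), k = 1..n."""
    return [2 * math.cos(k * math.pi / (n + 1)) for k in range(1, n + 1)]

def binary_efficiency(n):
    """log2 of the largest root, i.e. the binary limiting efficiency (n >= 2)."""
    return math.log2(2 * math.cos(math.pi / (n + 1)))

if __name__ == "__main__":
    for n in (2, 3, 5, 10, 50):
        worst = max(abs(phi(n, r)) for r in roots(n))
        print(n, round(binary_efficiency(n), 4), worst)
```

The largest root, $2\cos[\pi/(n+1)]$, approaches 2 as $n$ grows, so the binary limiting efficiency approaches 1.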














Fig. 1. Limiting efficiency $\eta$ vs. number of allowable RDS states $n$,* binary alphabet (+1, −1).
*$n = M^+ - M^- + 1$, where $M^+$ and $M^-$ are the upper and lower bounds of
the RDS.

Fig. 2. Limiting efficiency $\eta$ vs. number of allowable RDS states $n$,* quaternary alphabet (+3, +1,
−1, −3). *$n = M^+ - M^- + 1$, where $M^+$ and $M^-$ are the upper and lower
bounds of the RDS.

ing $\lambda$ in $\phi_n(\lambda)$ by $\lambda + 1$. Therefore, one gets $\lambda_{\max}$ of the ternary case
by adding 1 to the corresponding (same $n$) $\lambda_{\max}$ of the binary case.
The top curve in Fig. 3 is plotted in this way.

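(iv) From the well-known formula$^{10}$

$$\lambda_{\max} = \max_{\|x\|=1} x' A_n x, \qquad (44)$$

where

$$x = [x_1, \cdots, x_n]' \qquad (45)$$

and the norm of a vector, $\|x\|$, is

$$\|x\| = \left(\sum_{i=1}^{n} x_i^2\right)^{1/2}, \qquad (46)$$

we have

$$\lambda_{\max} = \max_{\|x\|=1}\left\{\sum_{a_i\ge 0}\sum_{t=1}^{n-a_i} x_t x_{t+a_i} + \sum_{a_i<0}\sum_{t=1}^{n+a_i} x_t x_{t-a_i}\right\}, \qquad (47)$$

where the $a_i$'s are members of the alphabet set. For example,

$$\lambda_{\max} = \max_{\|x\|=1} 2\sum_{t=1}^{n-1} x_t x_{t+1} \qquad (48)$$

in the binary case;

$$\lambda_{\max} = \max_{\|x\|=1}\left\{1 + 2\sum_{t=1}^{n-1} x_t x_{t+1}\right\} \qquad (49)$$

in the ternary case; and

$$\lambda_{\max} = \max_{\|x\|=1} 2\left\{\sum_{t=1}^{n-1} x_t x_{t+1} + \sum_{t=1}^{n-3} x_t x_{t+3}\right\} \qquad (50)$$

in the quaternary case.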


Fig. 3. Limiting efficiency $\eta$ vs. number of allowable RDS states $n$,* ternary alphabet (+1, 0, −1); theoretical and approximate curves.
*$n = M^+ - M^- + 1$, where $M^+$ and $M^-$ are the upper and lower bounds of
the RDS.

In this formulation, $\lambda_{\max}$ becomes the extreme value of a quadratic
form with an equality constraint. There are a number of ways to
effect a numerical solution.


4.2 Approximation Formula

To search for $\lambda_{\max}$ using equation (47) is not an easy alternative,
but it leads to an estimate of $\lambda_{\max}$. If we let $x_1 = x_2 = \cdots = x_n$
(so that each $x_i = 1/\sqrt{n}$ under the constraint $\|x\| = 1$),
then, from equation (47), we have

$$\lambda_{\max} \approx \sum_{a_i\ge 0}\frac{n - a_i}{n} + \sum_{a_i<0}\frac{n + a_i}{n}. \qquad (51)$$

Any other choice of the $x_i$'s will lead to a different estimate of $\lambda_{\max}$,
which may be better or worse than that of equation (51). We justify
the present choice by noting the simplicity of equation (51). Using
equation (51), we obtain an approximation formula for the limiting
efficiency $\eta$,

$$\eta \approx \frac{\log_2\left(\sum_{a_i\ge 0}\dfrac{n - a_i}{n} + \sum_{a_i<0}\dfrac{n + a_i}{n}\right)}{\log_2 K}, \qquad (52)$$

where the $a_i$'s are members of the K-ary alphabet set. The approximate
values of $\eta$ are also plotted in Figs. 1 to 3 (dashed curves).
As expected, the approximation is reasonably good for large $n$, and
it is for large $n$ that we may have to rely on the approximation formula.

4.3 Discussion

(i) It is of some interest to see how the efficiencies of various codes
with the dc-constrained property compare with the limiting curves.
We have located the following known codes:
ZDN (Zero-Disparity Binary Code of Block Length N)* and
LDN (Low-Disparity Binary Code of Block Length N)* in Fig. 1;
and
BP (Bipolar Code), n = 2, $\eta$ = 0.63,†
PST (Paired Selected Ternary Code), n = 4, $\eta$ = 0.63,
BNZS (Bipolar with N Zero Substitution Codes),
n = 4, $\eta$ = 0.63,
VL43 (Variable Length Ternary Code), n = 5, $\eta$ = 0.84, and
MS43 (Fixed Length Ternary Code), n = 6, $\eta$ = 0.84, in Fig. 3.


* N = 6.
† The allowable states for BP are −1 and 0 or 0 and +1, depending
upon whether the first pulse transmitted is −1 or +1. A similar situation exists
for PST and BNZS.

The BP and BNZS codes are used in the T-1 and T-2 systems, respectively,
and PST is used in the experimental T-4 system. In
comparison with their limiting efficiencies, one gets the impression that,
barring the fact that these codes have other properties in addition to
the dc constraint, there is some room for improving the coding efficiency.
VL43 and MS43 are examples along this direction.

It should be pointed out* that the real engineering problem is to
control baseline wander. A true comparison of different coding schemes
should therefore be done on this basis. The relation between RDS
and baseline wander is an elusive one. It depends upon the detailed
structure of the code in question and the channel to which the signal
is applied. In terms of RDS, it depends not only on the bounds and
the distribution of the allowable RDS but also on its dynamics. By
dynamics we mean the "speed" of moving from one state to another;
the dynamic behavior is of importance because the channel has a
"failing memory," so to speak. For example, it can be shown$^{11}$ under
certain conditions that a quick jump from one extreme state to the
other results in a larger amount of baseline wander than would occur
from staying in an extreme state for a long time.

(ii) As a general observation, the limiting curves saturate rapidly.
This implies that, to make a high-efficiency code possible, a physical
system should be designed to operate beyond the fast-rising portion
of the curves. It would be reasonable then to expect that a simple
ternary block code could be found with 90 percent efficiency or better
for, say, n = 15.

(iii) An interesting question which arises naturally in connection
with the limiting efficiency is its realizability. If one accepts infinite
delay, the answer is affirmative. If one thinks in terms of block codes
of finite length, then the limiting efficiency cannot be realized.


V. CONCLUSION 


We have shown that, for a dc-constrained code, the limiting efficiency
is related to the number of allowable RDS states in a very
simple way. The result is effective in the sense that it lends itself
easily to numerical evaluation.

The underlying mathematical fact in our proof is the property of
non-negative matrices and vectors. Using the theorem of Frobenius
on non-negative matrices,$^{10}$ our result can be proved in a few steps.
We retain our approach for the reasons that it requires only elementary
knowledge of matrix theory and it gives more insight into the
problem.

* The discussion here is heuristic in nature. A thorough treatment of the subject
is beyond the scope of this paper.

The technique developed in this paper can be used to investigate
bounds for some other classes of codes, say timing codes. This will
be done elsewhere.
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APPENDIX 


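The class of matrices which we want to investigate has the following
general form*

$$A_n = \begin{bmatrix}
0 & 1 & & & 0 \\
1 & 0 & 1 & & \\
 & 1 & 0 & \ddots & \\
 & & \ddots & \ddots & 1 \\
0 & & & 1 & 0
\end{bmatrix}. \qquad (53)$$

Each matrix $A_n$ is a square matrix of size $n$, and it has ones in the
super- and sub-diagonals and zeros elsewhere.
Let $\phi_n(\lambda)$ denote the characteristic polynomial of $A_n$. By definition,

$$\phi_n(\lambda) = \det[\lambda I_n - A_n]
= \det\begin{bmatrix}
\lambda & -1 & & & 0 \\
-1 & \lambda & -1 & & \\
 & -1 & \lambda & \ddots & \\
 & & \ddots & \ddots & -1 \\
0 & & & -1 & \lambda
\end{bmatrix}. \qquad (54)$$

The first few polynomials $\phi_n(\lambda)$ are

$$
\begin{aligned}
\phi_0(\lambda) &= 1, \\
\phi_1(\lambda) &= \lambda, \\
\phi_2(\lambda) &= \lambda^2 - 1, \\
\phi_3(\lambda) &= \lambda^3 - 2\lambda, \\
\phi_4(\lambda) &= \lambda^4 - 3\lambda^2 + 1,
\end{aligned} \qquad (55)
$$

where $\phi_0(\lambda)$ is defined to be 1. The polynomials $\phi_n(\lambda)$ have some
interesting properties. We state them as lemmas.

* A special class of Jacobi matrix.22

Lemma 1: $\phi_n(\lambda)$ satisfies the following recursive relations:
for $n \ge 2$,

$$\phi_n(\lambda) = \lambda\phi_{n-1}(\lambda) - \phi_{n-2}(\lambda), \qquad (56)$$

and for $n > m$,

$$\phi_n(\lambda) = \phi_m(\lambda)\phi_{n-m}(\lambda) - \phi_{m-1}(\lambda)\phi_{n-m-1}(\lambda). \qquad (57)$$

Proof: To prove equation (56), we expand the determinant in (54)
with respect to the first column. To prove (57), we expand the
determinant with respect to the first $m$ columns and observe that the
only two nonzero products of minors are of size $m$ and $n - m$.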
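Both recursive relations of Lemma 1 are easy to confirm symbolically for small orders. The sketch below represents each polynomial by its list of integer coefficients (lowest degree first); the helper names are illustrative and not part of the original text.

```python
def phi_coeffs(n):
    """Coefficients of phi_n, lowest degree first, from the three-term
    recursion phi_n = lambda*phi_{n-1} - phi_{n-2}."""
    prev, cur = [1], [0, 1]                    # phi_0 = 1, phi_1 = lambda
    if n == 0:
        return prev
    for _ in range(2, n + 1):
        shifted = [0] + cur                    # multiply phi_{k-1} by lambda
        padded = prev + [0] * (len(shifted) - len(prev))
        prev, cur = cur, [s - p for s, p in zip(shifted, padded)]
    return cur

def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def poly_sub(a, b):
    size = max(len(a), len(b))
    a = a + [0] * (size - len(a))
    b = b + [0] * (size - len(b))
    return [x - y for x, y in zip(a, b)]

def product_relation_holds(n, m):
    """Check phi_n = phi_m*phi_{n-m} - phi_{m-1}*phi_{n-m-1} for 1 <= m < n."""
    rhs = poly_sub(poly_mul(phi_coeffs(m), phi_coeffs(n - m)),
                   poly_mul(phi_coeffs(m - 1), phi_coeffs(n - m - 1)))
    return phi_coeffs(n) == rhs

if __name__ == "__main__":
    print(phi_coeffs(4))
    print(all(product_relation_holds(n, m)
              for n in range(2, 11) for m in range(1, n)))
```

Setting $m = 1$ in the product relation recovers the three-term recursion, which is a quick consistency check on the indices.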
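Lemma 2: $\phi_n(\lambda)$ is an even (odd) polynomial if $n$ is even (odd), i.e.,

$$\phi_n(\lambda) = (-1)^n\phi_n(-\lambda). \qquad (58)$$

Proof: Assume that (58) is true for $\phi_{n-1}(\lambda)$ and $\phi_{n-2}(\lambda)$; then by
(56),

$$
\begin{aligned}
\phi_n(-\lambda) &= -\lambda\phi_{n-1}(-\lambda) - \phi_{n-2}(-\lambda) \\
&= (-1)^n\lambda\phi_{n-1}(\lambda) - (-1)^{n-2}\phi_{n-2}(\lambda) \\
&= (-1)^n[\lambda\phi_{n-1}(\lambda) - \phi_{n-2}(\lambda)] \\
&= (-1)^n\phi_n(\lambda).
\end{aligned}
$$

Since (58) is true for $\phi_0(\lambda)$ and $\phi_1(\lambda)$ by inspection of (55), (58) is
true for any $n$ by induction.

Lemma 3: If $\lambda_0$ is a root of $\phi_n(\lambda)$ and $\lambda_0 \neq 0$, then $-\lambda_0$ is also a root
of $\phi_n(\lambda)$.

Proof: Follows directly from the previous lemma.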

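By definition, the roots of $\phi_n(\lambda)$ are the eigenvalues of the matrix $A_n$.
Since $A_n$ is real symmetric, it follows that all the roots of $\phi_n(\lambda)$ are
real. Let $\lambda_{\max}^{(n)}$ denote the root of $\phi_n(\lambda)$ with largest absolute value. In
view of the result in Lemma 3, $\lambda_{\max}^{(n)}$ can always be taken to be positive.

Lemma 4: For any finite $n$, $\lambda_{\max}^{(n)}$ of $\phi_n(\lambda)$ has the following ordering:

$$\lambda_{\max}^{(1)} < \lambda_{\max}^{(2)} < \cdots < \lambda_{\max}^{(n)} < 2. \qquad (59)$$

Proof: We will prove this lemma again by induction. Clearly, $\phi_n(\lambda) > 0$
as $\lambda \to \infty$ for any $n = 1, 2, \cdots$. It follows that $\phi_n(\lambda) > 0$ for $\lambda >
\lambda_{\max}^{(n)}$, the largest positive root of $\phi_n(\lambda)$. Now, assume that (59) holds for
$\lambda_{\max}^{(n-1)}$ and $\lambda_{\max}^{(n-2)}$; then, from (56),

$$\phi_n(\lambda_{\max}^{(n-1)}) = \lambda_{\max}^{(n-1)}\phi_{n-1}(\lambda_{\max}^{(n-1)}) - \phi_{n-2}(\lambda_{\max}^{(n-1)})
= -\phi_{n-2}(\lambda_{\max}^{(n-1)}) < 0. \qquad (60)$$

Hence $\phi_n(\lambda)$ changes sign at least once between $\lambda_{\max}^{(n-1)}$ and $\infty$. This
implies that

$$\lambda_{\max}^{(n)} > \lambda_{\max}^{(n-1)}. \qquad (61)$$

Since (59) holds for $n = 1$ and 2, it holds for any $n$.

To show that $\lambda_{\max}^{(n)} < 2$ for any finite $n$, we make the observation
that for $\lambda > 2$, the matrix $\lambda I_n - A_n$ is a dominant matrix, and therefore
nonsingular. For $\lambda = 2$,

$$\phi_n(2) = n\phi_1(2) - (n-1)\phi_0(2) = n + 1 \neq 0, \qquad (62)$$

by repeated use of the recursive relation (56).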
Lemma 5: For some number $\lambda_0$, if $\phi_n(\lambda_0) = 0$, then $\phi_{n-1}(\lambda_0) \neq 0$.
Proof: Assume the contrary, i.e.,

$$\phi_n(\lambda_0) = \phi_{n-1}(\lambda_0) = 0. \qquad (63)$$

Then from the recursive formula (56),

$$\phi_{n-2}(\lambda_0) = 0. \qquad (64)$$

Repeating the same argument, we conclude

$$\phi_n(\lambda_0) = \cdots = \phi_1(\lambda_0) = \phi_0(\lambda_0) = 0, \qquad (65)$$

which is impossible, since $\phi_0(\lambda) = 1$.

Lemma 6: $\phi_n(\lambda)$ has only simple roots.

Proof: Let $\lambda_0$ be a root of $\phi_n(\lambda)$; then the matrix $[\lambda_0 I_n - A_n]$ is
singular. On the other hand, from the previous lemma, $[\lambda_0 I_{n-1} -
A_{n-1}]$ is nonsingular. Hence the null space of the matrix $[\lambda_0 I_n - A_n]$
is one-dimensional. From the fact that $A_n$ is diagonalizable, we conclude
that $\lambda_0$ must be a simple root of $\phi_n(\lambda)$.
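Lemma 7: Write

$$A_n = P D_n P', \qquad (66)$$

where

$$D_n = \mathrm{diag}[\lambda_1, \cdots, \lambda_n] \qquad (67)$$

and

$$PP' = I_n; \qquad (68)$$

then in general, $P$ can be expressed in terms of the $\phi_i(\lambda)$'s,

$$P = \begin{bmatrix}
\dfrac{\phi_0(\lambda_1)}{\phi(\lambda_1)} & \dfrac{\phi_0(\lambda_2)}{\phi(\lambda_2)} & \cdots & \dfrac{\phi_0(\lambda_n)}{\phi(\lambda_n)} \\
\vdots & \vdots & & \vdots \\
\dfrac{\phi_{n-1}(\lambda_1)}{\phi(\lambda_1)} & \dfrac{\phi_{n-1}(\lambda_2)}{\phi(\lambda_2)} & \cdots & \dfrac{\phi_{n-1}(\lambda_n)}{\phi(\lambda_n)}
\end{bmatrix}, \qquad (69)$$

where

$$\phi(\lambda) = \left[\sum_{i=0}^{n-1}\phi_i^2(\lambda)\right]^{1/2} \qquad (70)$$

is a normalization constant.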


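Proof: Let $x$ be an eigenvector corresponding to an eigenvalue
$\lambda$ of $A_n$,

$$A_n x = \lambda x. \qquad (71)$$

Write

$$x = [x_1, \cdots, x_n]'$$

and expand (71); we have

$$
\begin{aligned}
x_2 &= \lambda x_1, \\
x_3 &= \lambda x_2 - x_1, \\
x_4 &= \lambda x_3 - x_2, \\
&\;\;\vdots \\
x_n &= \lambda x_{n-1} - x_{n-2}, \\
0 &= \lambda x_n - x_{n-1}.
\end{aligned} \qquad (72)
$$

Delete the last equation in (72) and then compare with equations (55)
and (56). We can make the following identification:

$$
\begin{aligned}
x_1 &= \phi_0(\lambda), \\
x_2 &= \phi_1(\lambda), \\
&\;\;\vdots \\
x_n &= \phi_{n-1}(\lambda).
\end{aligned} \qquad (73)
$$

To normalize $x$, we divide (73) by the square root of the inner product of $x$
with itself. Denoting this quantity by $\phi(\lambda)$,

$$\phi(\lambda) = \left[\sum_{i=0}^{n-1}\phi_i^2(\lambda)\right]^{1/2},$$

the normalized eigenvector corresponding to the eigenvalue $\lambda$ is
given by

$$\left[\frac{\phi_0(\lambda)}{\phi(\lambda)}, \frac{\phi_1(\lambda)}{\phi(\lambda)}, \cdots, \frac{\phi_{n-1}(\lambda)}{\phi(\lambda)}\right]'. \qquad (74)$$

It is well known that eigenvectors corresponding to different eigenvalues
are orthogonal. Therefore, for $P$ as defined in (69), $P'P = I_n$.
Since each column of $P$ is an eigenvector of $A_n$, it follows that

$$A_n P = P D_n,$$

and equation (66) is immediate.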
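The eigenvector construction of Lemma 7 can be verified numerically. The following sketch assumes the closed-form eigenvalues $2\cos[k\pi/(n+1)]$ of the binary-case matrix given in the main text and builds each eigenvector from the polynomial values $\phi_0(\lambda), \cdots, \phi_{n-1}(\lambda)$; the function names are illustrative.

```python
import math

def phi_values(n, lam):
    """[phi_0(lam), ..., phi_{n-1}(lam)] from the three-term recursion."""
    vals = [1.0, lam][:n]
    while len(vals) < n:
        vals.append(lam * vals[-1] - vals[-2])
    return vals

def eigenpair(n, k):
    """k-th eigenvalue (k = 1..n) and its normalized eigenvector."""
    lam = 2 * math.cos(k * math.pi / (n + 1))
    x = phi_values(n, lam)
    norm = math.sqrt(sum(v * v for v in x))
    return lam, [v / norm for v in x]

def apply_A(x):
    """Multiply by the matrix with ones on the super- and sub-diagonals."""
    n = len(x)
    return [(x[i - 1] if i > 0 else 0.0) + (x[i + 1] if i + 1 < n else 0.0)
            for i in range(n)]

def residual(n, k):
    """Largest component of A x - lam x for the k-th eigenpair."""
    lam, x = eigenpair(n, k)
    return max(abs(a - lam * v) for a, v in zip(apply_A(x), x))

if __name__ == "__main__":
    n = 6
    print(max(residual(n, k) for k in range(1, n + 1)))
```

The last row of the eigenvalue equation, which was set aside in the proof, is satisfied only when $\lambda$ is a root of $\phi_n$; the residual check exercises exactly that row.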
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Pull-In Range of a Phase-Locked Loop 
With a Binary Phase Comparator 


By JAMES F. OBERST 
(Manuscript received May 25, 1970) 


We develop a method for calculating the pull-in range of a phase-locked
loop with a binary phase comparator and an arbitrary loop filter. Complete
numerical results are presented for loop filters of the phase-lag and low-pass
types. The problem of stability is also considered, and it is proved that with
these loop filters no steady-state phase jitter can exist after frequency
acquisition has been achieved.


I. INTRODUCTION 


The phase-locked loop (PLL) is an important element of many
modern communication and control systems. A PLL block diagram is
shown in Fig. 1. The input $v_1(\omega_0 t + \theta_1)$ is a narrow-band signal with
carrier frequency $\omega_0$ and phase $\theta_1(t)$. This phase is compared with
the phase $\theta_2(t)$ of the voltage-controlled oscillator (VCO) in the phase
comparator (PC). The PC output $f(\phi)$, where $\phi = \theta_1 - \theta_2$, is filtered
by the loop filter $H(p)$ and applied to the VCO control terminal.

Depending on the values of the PLL parameters, the phase error $\phi$
can be kept small even with input phase modulation. Thus with $\theta_1(t)
= \Omega t + \theta_{10}$, which represents a constant input frequency offset, the
system can produce a synchronized signal $v_2(t)$ with frequency $\omega_0 +
\Omega$. This synchronization capability leads to PLL applications in carrier
extraction,’ frequency synthesis,? narrow-band filtering,? FM demodu- 
lation,’ timing extraction in PCM and data transmission systems,‘ etc. 

In this paper, we examine the acquisition, or pull-in, range of a
PLL with a binary PC. We present numerical results for the special
case of a second-order PLL with either low-pass or phase-lag loop
filter.

Fig. 1. PLL block diagram.

The PC characteristic considered here is the binary curve shown in
Fig. 2. It is of interest in at least three situations. First, since many
synchronization systems are designed to operate with very small
phase errors, dynamic range limitations in the PC circuitry often produce
severe saturation effects. The characteristic of Fig. 2 corresponds
to the extreme case of vanishing linear range near zero phase error.
However, it is a useful approximation for systems with small but nonzero
dynamic range for the purpose of studying pull-in performance.
Second, a binary PC can be easily implemented with logic circuits.
The resulting characteristic differs from the ideal of Fig. 2 by exhibiting
small hysteresis about the zeros at $\phi = n\pi$, but this hysteresis has
no effect on the pull-in range achieved. Finally, J. J. Stiffler® has 
shown that for a first-order PLL with additive white gaussian noise
and no frequency offset, the cross-correlation type PC which minimizes
$\Pr\{|\phi| > \phi_0\}$ for all $\phi_0$ has the characteristic of Fig. 2.
Although such a square-wave correlation function is unrealizable, 
this result suggests that PLLs employing other types of PC having 
this characteristic are worthy of consideration. In addition, similar 
“bang-bang” control characteristics are known to be optimum for 
PLL acquisition.® 


II. CALCULATION OF PULL-IN RANGE 


The phase model corresponding to the PLL block diagram in Fig.
1 is shown in Fig. 3. We assume that the gain of the loop filter $H(p)$
is unity at dc. The input-signal frequency differs from the VCO center
frequency by $\Omega$ rad/s:

$$\theta_1(t) = \Omega t + \theta_{10}. \qquad (1)$$

It is convenient to normalize the detuning $\Omega$ to the dc loop gain* $\alpha$:

$$y = \Omega/\alpha. \qquad (2)$$

* Since no gain can be defined for the binary PC being considered, the symbol
$\alpha$ does not represent the usual small-signal loop gain.

Fig. 2. Phase comparator characteristic.

When the normalized detuning $y$ is not too great, the VCO frequency
changes toward the input frequency, and eventually the PLL synchronizes
to the input signal with zero frequency error and finite phase
error. The normalized lock range $y_L$ is the maximum detuning $|y|$
for which the PLL can remain locked after synchronization has been
established. Inspection of dc conditions in the phase model of Fig.
3 shows that $y_L = 1$. The normalized pull-in range $y_p$ is the maximum
$|y|$ for which eventual synchronization is assured from any initial
conditions in the loop filter and VCO. In general $y_p \le 1$ for PLLs of
order higher than first. The order of a PLL is defined as one plus the
number of poles in $H(p)$. Calculation of $y_p$ is the subject of this paper.

The method employed here is similar to that used by A. J. Goldstein
to calculate $y_p$ for a PLL with a sawtooth PC. Due to the binary
nature of $f(\phi)$, the PC output waveform $f[\phi(t)]$ is piecewise constant,
assuming only the values $\pm 1$. Assume that the PLL is not synchronized
to the input signal. Then $\phi(t)$ increases with time (for $\Omega > 0$), and the
waveforms $\phi(t)$ and $f[\phi(t)]$ appear as shown in Fig. 4. The time origin
has been selected so that $\phi(0) = 0$. The transition instants are

$$0 = t_{02} < t_{11} < t_{12} < \cdots < t_{j1} < t_{j2} < \cdots, \qquad (3)$$

Fig. 3. Phase model of PLL.
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#(t) 





Fig. 4. Phase and PC output waveforms.

where

$$\phi(t_{j1}) = (2j - 1)\pi \quad \text{(negative transition)},$$
$$\phi(t_{j2}) = 2j\pi \quad \text{(positive transition)}. \qquad (4)$$

An expression can be written for $f[\phi(t)]$ by summing all of the segments
corresponding to $2\pi$ increments in $\phi(t)$:

$$f[\phi(t)] = \sum_{j}\left[u(t - t_{j-1,2}) - 2u(t - t_{j1}) + u(t - t_{j2})\right]. \qquad (5)$$

The $j = 1$ segment is shown crosshatched in Fig. 4. Since the PLL is
not synchronized to the input signal, the steady-state PC waveform
$f_{ss}[\phi(t)]$ is periodic, and the transition instants can be written:

$$t_{j2} = j(T_3 + T_4), \qquad t_{j1} = j(T_3 + T_4) - T_4. \qquad (6)$$

$T_3$ and $T_4$ are the times between transitions as indicated in Fig. 4.
Using equation (6), equation (5) becomes:

$$f_{ss}[\phi(t)] = \sum_{j=-\infty}^{\infty}\left\{u(t - (j-1)(T_3 + T_4)) - 2u(t - j[T_3 + T_4] + T_4) + u(t - j(T_3 + T_4))\right\}. \qquad (7)$$

From equation (1) and Fig. 3, the phase error in steady state is:

$$\phi_{ss}(t) = \Omega t + \phi_0 - \frac{\alpha H(p)}{p} f_{ss}[\phi(t)], \qquad (8)$$

where $\phi_0$ is some constant. Since $f_{ss}[\phi(t)]$ is composed only of step
functions, the last term in equation (8) can be written as a sum of
delayed functions $g(t)$, where

$$\mathcal{L}\{g(t)\} = G(p) = \frac{H(p)}{p^2}. \qquad (9)$$

The general expression for $\phi_{ss}(t)$ is then

$$\phi_{ss}(t) = \Omega t + \phi_0 - \alpha\sum_{j=-\infty}^{\infty}\left[g(t - (j-1)(T_3 + T_4)) - 2g(t - j[T_3 + T_4] + T_4) + g(t - j(T_3 + T_4))\right]. \qquad (10)$$

From equation (4) and Fig. 4, we have the following conditions on
$\phi_{ss}(t)$:

$$\phi_{ss}(0) = 0, \qquad \phi_{ss}(T_3) = \pi, \qquad \phi_{ss}(T_3 + T_4) = 2\pi. \qquad (11)$$

These conditions are sufficient to determine the unknown constants $\phi_0$,
$T_3$, and $T_4$ in equation (10). Thus equations (10) and (11) together
define the relationship between the normalized detuning $y$ and the
loop-filter parameters [through $g(t)$] which must be satisfied for the
PLL to be out-of-lock in the steady state. Then clearly $y_p$ is the
minimum value of $y$ for which these equations possess a solution.


III. RESULTS FOR SECOND-ORDER PLL 


In this section, the method derived above is applied to the second-order
PLL with loop filter

$$H(p) = \frac{1 + T_2 p}{1 + T_1 p}. \qquad (12)$$

This is a phase-lag filter for $0 < T_2 < T_1$. It becomes a simple low-pass
filter for $T_2 = 0$, and setting $T_2 = T_1$ reduces the PLL to first
order. The corresponding $g(t)$, from equation (9), is

$$g(t) = \left[t - (T_1 - T_2)\left(1 - \exp(-t/T_1)\right)\right]u(t). \qquad (13)$$
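The step from the transform $G(p) = H(p)/p^2$ to the time function $g(t)$ can be filled in by a partial-fraction expansion. The intermediate coefficients below are not given in the original; they are a sketch of the calculation, assuming the phase-lag filter $H(p) = (1 + T_2 p)/(1 + T_1 p)$.

```latex
\begin{aligned}
G(p) &= \frac{1 + T_2 p}{p^{2}(1 + T_1 p)}
      = \frac{1}{p^{2}} - \frac{T_1 - T_2}{p} + \frac{T_1 - T_2}{p + 1/T_1}, \\[4pt]
g(t) &= \bigl[\,t - (T_1 - T_2) + (T_1 - T_2)\,e^{-t/T_1}\bigr]u(t)
      = \bigl[\,t - (T_1 - T_2)\bigl(1 - e^{-t/T_1}\bigr)\bigr]u(t).
\end{aligned}
```

Recombining the three fractions over the common denominator $p^{2}(p + 1/T_1)$ gives the numerator $(1 + T_2 p)/T_1$, which confirms the expansion.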


In Appendix A, equation (10) is rewritten using equation (13) and is
evaluated at $t = 0$, $T_3$, and $T_3 + T_4$. Equation (11) is then applied,
along with the normalization

$$\tau_i = \alpha T_i, \qquad i = 1, 2, 3, 4. \qquad (14)$$

The equations which result are

$$y = \frac{2\pi + \tau_3 - \tau_4}{\tau_3 + \tau_4}, \qquad (15)$$

$$4(\tau_1 - \tau_2)(\tau_3 + \tau_4)
= \left[\tau_4(\tau_3 + \pi) + \tau_3(\tau_4 - \pi)\right]\left[\coth\frac{\tau_3}{2\tau_1} + \coth\frac{\tau_4}{2\tau_1}\right]. \qquad (16)$$

According to the discussion following equation (11), the pull-in range is

$$y_p = \min_{\tau_3,\,\tau_4 > 0}\frac{2\pi + \tau_3 - \tau_4}{\tau_3 + \tau_4} \qquad (17)$$

subject to the constraint equation (16). It is important to note that
equation (15) is simply the dc balance equation for the PLL model
of Fig. 3, and therefore holds for any $H(p)$ with unity dc gain. Since
equation (15) gives $y$ explicitly in terms of $\tau_3$ and $\tau_4$, it can always
be used to eliminate $y$ from a constraint equation corresponding to
equation (16). Therefore $y_p$ can be calculated from equation (17)
for any $H(p)$ subject to the appropriate constraint equation which
relates the loop-filter parameters to the transition-time parameters
$\tau_3$ and $\tau_4$. Hence only $\phi_{ss}(0)$ and $\phi_{ss}(T_3)$ actually had to be evaluated
in Appendix A.

The method employed to evaluate $y_p$ for various values of $\tau_1$ and $\tau_2$
was to choose $\tau_3 > 0$, use equation (16) to obtain the corresponding $\tau_4$,
and calculate $y$ from equation (15). The $\tau_3$, $\tau_4$ relationship was found
to be single-valued for all loop filters investigated. Examples of the
behavior of $y$ with $\tau_3$ are given in Fig. 5. The filter parameter $r$ is
defined as

$$r = \tau_2/\tau_1. \qquad (18)$$

In all cases the curves $y(\tau_3)$ are smooth and exhibit a single local minimum.
This minimum is found by computing a sequence $y(\tau_{3n})$, where
$\tau_{3n} > \tau_{3,n-1}$. When this sequence begins to increase, the minimum $y_p$
has just been passed, and can be estimated accurately from the last
three computed values of $y$.
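The numerical procedure just described can be sketched as follows. This is a minimal illustration under several assumptions, not the original program: the constraint is written in the exponential form of the condition $\phi_{ss}(T_3) = \pi$ from Appendix A with $y$ eliminated through the dc balance equation (assumed equivalent to the hyperbolic-cotangent form of the constraint); $\tau_4$ is found by bisection; the minimum is taken over a plain grid scan instead of the stopping rule in the text; and the result is clipped at the lock range of 1. All function names, the grid step, and the scan limit are hypothetical.

```python
import math

def constraint(tau3, tau4, tau1, tau2):
    """Residual of the phi_ss(T3) = pi condition with y eliminated.
    Zero when (tau3, tau4) is a consistent pair of transition times."""
    e3 = -math.expm1(-tau3 / tau1)             # 1 - exp(-tau3/tau1)
    e4 = -math.expm1(-tau4 / tau1)
    e34 = -math.expm1(-(tau3 + tau4) / tau1)
    lhs = (2 * tau3 * tau4 + math.pi * (tau4 - tau3)) / (tau3 + tau4)
    return lhs - 2 * (tau1 - tau2) * e3 * e4 / e34

def solve_tau4(tau3, tau1, tau2):
    """Bisection for tau4; the residual is negative near 0, positive for large tau4."""
    lo, hi = 1e-12, 1.0
    while constraint(tau3, hi, tau1, tau2) <= 0:
        hi *= 2
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if constraint(tau3, mid, tau1, tau2) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def detuning(tau3, tau4):
    """Normalized detuning from the dc balance equation."""
    return (2 * math.pi + tau3 - tau4) / (tau3 + tau4)

def pull_in_range(tau1, r, step=0.05, tau3_max=200.0):
    """Grid-scan estimate of y_p for filter parameters tau1 and r = tau2/tau1."""
    tau2 = r * tau1
    best = float("inf")
    steps = int(tau3_max / step)
    for k in range(1, steps + 1):
        tau3 = k * step
        best = min(best, detuning(tau3, solve_tau4(tau3, tau1, tau2)))
    return min(best, 1.0)                      # y_p cannot exceed the lock range

if __name__ == "__main__":
    print(pull_in_range(1e4, 0.1))             # strong filtering, r = 0.1
    print(pull_in_range(10.0, 0.5))
```

With very strong filtering (large $\tau_1$) and $r = 0.1$, the scan should return a value close to $2\sqrt{r(1-r)} = 0.6$, the asymptote derived in the next section.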


Curves of $y_p$ versus $\tau_1$ with $r$ as a parameter are presented in Fig. 6.
Several characteristics are notable. First, $y_p = 1$ for $r \ge 0.5$ independent
of $\tau_1$. Second, as $\tau_1$ increases with $r$ constant, $y_p$ approaches an
asymptotic value $y_{p\infty}$ which is a function only of $r$. Later we shall derive
an explicit formula for $y_{p\infty}$. The same results are presented in a different
way in Fig. 7, which shows curves of constant $y_p$ on the $\tau_1$, $\tau_2$
parameter plane. Below we derive the equation for the curve in this
figure corresponding to $y_p = 1$.

Fig. 5. $y$ versus $\tau_3$ (curves labeled by $\tau_1$, $r = \tau_2/\tau_1$).

Fig. 6. $y_p$ versus $\tau_1$ with parameter $r$ (asymptote: $y_{p\infty} = 2\sqrt{r(1-r)}$).

Fig. 7. $\tau_2$ versus $\tau_1$ with parameter $y_p$ (the $y_p = 1$ region is marked).

IV. FURTHER RESULTS


Let us consider the important case of very large $\tau_1$, which corresponds
to strong filtering in the PLL. Noting that

$$\coth x \simeq \frac{1}{x}, \qquad x \to 0, \qquad (19)$$

and using equation (18), equation (16) becomes for large $\tau_1$:

$$4\tau_1(1 - r)(\tau_3 + \tau_4) = \left[\tau_4(\tau_3 + \pi) + \tau_3(\tau_4 - \pi)\right]\left[\frac{2\tau_1}{\tau_3} + \frac{2\tau_1}{\tau_4}\right]. \qquad (20)$$

This simplifies to

$$\tau_4 = \frac{\pi\tau_3}{2r\tau_3 + \pi}. \qquad (21)$$

Substituting equation (21) into equation (15) gives the result for $y$
with $\tau_1$ large:

$$y = \frac{r\tau_3^2 + 2\pi r\tau_3 + \pi^2}{r\tau_3^2 + \pi\tau_3}. \qquad (22)$$

All minima of $y$ must satisfy

$$\frac{dy}{d\tau_3} = 0. \qquad (23)$$

Applying equation (23) to equation (22) yields a single positive value
of $\tau_3$:

$$\tau_3 = \frac{\pi\left[1 + \left(\dfrac{1}{r} - 1\right)^{1/2}\right]}{1 - 2r}, \qquad r < 0.5. \qquad (24)$$

Substituting equation (24) into equation (22) gives the result

$$y_{p\infty} = \begin{cases} 2[r(1 - r)]^{1/2}, & 0 \le r < 0.5, \\ 1, & r \ge 0.5. \end{cases} \qquad (25)$$
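The closed-form asymptote can be confirmed by evaluating the large-$\tau_1$ expression for $y$ at the stationary point. The sketch below uses only the formulas of this section; the function names are illustrative.

```python
import math

def y_large_tau1(tau3, r):
    """y as a function of tau3 in the strong-filtering limit."""
    return ((r * tau3 ** 2 + 2 * math.pi * r * tau3 + math.pi ** 2)
            / (r * tau3 ** 2 + math.pi * tau3))

def tau3_at_minimum(r):
    """Positive stationary point of y(tau3); requires 0 < r < 0.5."""
    return math.pi * (1 + math.sqrt(1 / r - 1)) / (1 - 2 * r)

def y_p_infinity(r):
    """Asymptotic pull-in range for very large tau1."""
    if r >= 0.5:
        return 1.0
    return 2 * math.sqrt(r * (1 - r))

if __name__ == "__main__":
    for r in (0.05, 0.1, 0.25, 0.4):
        t3 = tau3_at_minimum(r)
        print(r, round(t3, 3), y_large_tau1(t3, r), y_p_infinity(r))
```

For $r \ge 0.5$ the expression for $y$ has no positive stationary point and decreases toward 1 from above, which is why the asymptote saturates at 1.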


This agrees with the numerical results in Fig. 6 and with a result
obtained by M. V. Kapranov by a different method in an untranslated
Russian paper. The existence of only a single minimum of $y(\tau_3)$ for
large $\tau_1$ supports our hypothesis of a single minimum for all $\tau_1$, which is
based on the curves in Fig. 5.

The region $y_p = 1$ in Fig. 7 corresponds to $\tau_1$, $\tau_2$ such that $y \ge 1$
for all $\tau_3$, $\tau_4$. From equation (15), this implies that

$$\tau_4 \le \pi. \qquad (26)$$

Thus from equation (16), $\tau_1$, $\tau_2$, and $\tau_3$ on the $y_p = 1$ boundary must
satisfy

$$4(\tau_1 - \tau_2)(\tau_3 + \pi) = \pi(\tau_3 + \pi)\left[\coth\frac{\tau_3}{2\tau_1} + \coth\frac{\pi}{2\tau_1}\right]. \qquad (27)$$

Examination of Fig. 5 shows that the critical $y$ curves approach $y = 1$
from above for very large $\tau_3$. Using

$$\lim_{x\to\infty}\coth x = 1, \qquad (28)$$

equation (27) becomes

$$4(\tau_1 - \tau_2)\tau_3 = \pi\tau_3\left[1 + \coth\frac{\pi}{2\tau_1}\right]. \qquad (29)$$

This reduces to the simple expression for the $y_p = 1$ curve:

$$\tau_2 = \tau_1 - \frac{\pi/2}{1 - \exp(-\pi/\tau_1)}. \qquad (30)$$
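The reduction to the final boundary expression rests on the identity $1 + \coth(x/2) = 2/(1 - e^{-x})$. The short sketch below evaluates the boundary both ways, assuming the hyperbolic-cotangent form $\tau_2 = \tau_1 - (\pi/4)[1 + \coth(\pi/2\tau_1)]$ as the intermediate step; the function names are illustrative.

```python
import math

def tau2_boundary(tau1):
    """Boundary of the unity pull-in region, exponential form."""
    return tau1 - (math.pi / 2) / (-math.expm1(-math.pi / tau1))

def tau2_boundary_coth(tau1):
    """Same boundary written with the hyperbolic cotangent."""
    x = math.pi / (2 * tau1)
    return tau1 - (math.pi / 4) * (1 + 1 / math.tanh(x))

if __name__ == "__main__":
    for tau1 in (1.0, 2.0, 10.0, 100.0, 1000.0):
        print(tau1, tau2_boundary(tau1), tau2_boundary(tau1) / tau1)
```

For large $\tau_1$ the ratio $\tau_2/\tau_1$ on the boundary tends to 0.5, consistent with unity pull-in range for $r \ge 0.5$; for small $\tau_1$ the computed boundary is negative, so every non-negative $\tau_2$ lies in the unity pull-in region.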


Finally, let us consider the following problem. The periodic behavior
of $\phi_{ss}(t)$ assumed throughout this paper, which led to equations (15)
and (16), is known as a limit cycle of the second kind in the phase
plane. The nonexistence of such limit cycles for $|y| < y_p$ proves that
frequency lock is eventually attained for these values of $y$. Physically
speaking, $\phi_{ss}(t)$ cannot increase monotonically with time as assumed
in Fig. 4. However, proper synchronization of the PLL requires that
phase lock also be achieved. This means that after a long enough time,
the system comes to rest:

$$\lim_{t\to\infty}\phi(t) = n\pi, \qquad |y| < y_p. \qquad (31)$$

Because of the gross nonlinearity $f(\phi)$ in the system considered here,
it is not obvious that equation (31) will be satisfied. Specifically, it is
conceivable that a series of self-sustaining overshoots in $\phi(t)$ could
become established after pull-in which would produce a periodic phase
jitter. This behavior corresponds to a limit cycle of the first kind in the
phase plane. Although this problem is not solved in general here, a
test which is valid for any $H(p)$ is applied to the phase-lag filter case
in Appendix B. It is found that in this case, phase lock described by
equation (31) is always achieved.


V. CONCLUSION 


A method has been presented for calculating the pull-in range $y_p$ of 
a PLL with a binary phase comparator and an arbitrary loop filter. 
The result is obtained as the minimum value of a function of two 
variables, subject to a constraint equation which relates these variables 
to the parameters of the loop filter. Complete numerical results for 
$y_p$ were obtained for loop filters of the phase-lag and low-pass types. 
Explicit formulas were given in this case for the asymptotic value of 
$y_p$ with strong loop filtering, and for the set of filter parameters which 
result in unity pull-in range. Finally, it was proved that no steady-state 
phase jitter can exist after pull-in with these loop filters. 
APPENDIX A 


Evaluation of $\phi_{ss}(t)$ 

Consider one period of $\phi_{ss}(t)$, and define the functions $\phi_1(t)$ and $\phi_2(t)$ as: 

$$\phi_{ss}(t) = \begin{cases} \phi_1(t), & 0 \le t \le T_3, \\ \phi_2(t), & T_3 \le t \le T_3 + T_4. \end{cases} \tag{32}$$


From Fig. 4, $\phi_1(t)$ includes all terms in equation (10) which correspond 
to transitions prior to $t = T_3$. Using equation (13) in (10), $\phi_1(t)$ 
becomes: 


y(t) = 2t+¢—-—a DY [t = Qj ea 1)(73 = T's) — (T, Ee T>) 


i=-0 


exp i= = Nie 2a) Ty 


+ 20 D0 [t-— (Ps +7) +7. — Cy — 7) 


=o 


-(1 — exp [-@ — (IT, + T.] + T.)/T:))] 


-a Y= i+ 2) - , - 19 


=—00 


“(1 — exp [—(¢ — gf's + T,))/T1))]. (33) 
Absorbing all constant terms into $\phi_0$, equation (33) becomes: 


di(t) = Qt + go — at 
— a(T, — T2) exp (—2t/T,) pz} [exp [j(73 + T4)/T,] 
— 2 exp [G[T3 + Ts] — T.)/T1] + exp [(T73 + Ts)/Ti]], 


= 21+ 6 — at — 2a(T, — T.) 
— exp (—T,/T;) 


OCI emma ToT | 
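$\phi_0$ is eliminated from equation (34) using 

$$\phi_1(0) = 0, \tag{11a}$$

which yields 

$$\phi_1(t) = (\Omega - \alpha)t + 2\alpha(T_1 - T_2)\,\frac{1 - \exp(-T_4/T_1)}{1 - \exp[-(T_3 + T_4)/T_1]}\,\bigl(1 - \exp(-t/T_1)\bigr). \tag{35}$$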
Now requiring that 

$$\phi_1(T_3) = \pi \tag{11b}$$

results in 

$$\pi = (\Omega - \alpha)T_3 + 2\alpha(T_1 - T_2)\,\frac{\bigl(1 - \exp(-T_3/T_1)\bigr)\bigl(1 - \exp(-T_4/T_1)\bigr)}{1 - \exp[-(T_3 + T_4)/T_1]}, \tag{36}$$

which after some manipulation becomes 

$$\pi = (\Omega - \alpha)T_3 + 4\alpha(T_1 - T_2)\Big/\left[\coth\frac{T_3}{2T_1} + \coth\frac{T_4}{2T_1}\right]. \tag{37}$$
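The "manipulation" between equations (36) and (37) can be sketched as follows. This is an illustrative derivation, not taken from the paper; the shorthand $a = T_3/T_1$ and $b = T_4/T_1$ is introduced here for brevity.

```latex
\begin{align*}
\coth\frac{a}{2} + \coth\frac{b}{2}
  &= \frac{1 + e^{-a}}{1 - e^{-a}} + \frac{1 + e^{-b}}{1 - e^{-b}}
   = \frac{2\left(1 - e^{-(a+b)}\right)}{(1 - e^{-a})(1 - e^{-b})}, \\[4pt]
\frac{(1 - e^{-a})(1 - e^{-b})}{1 - e^{-(a+b)}}
  &= \frac{2}{\coth(a/2) + \coth(b/2)} .
\end{align*}
```

Substituting the second line into the exponential ratio of equation (36) turns the factor $2\alpha(T_1 - T_2)$ into $4\alpha(T_1 - T_2)$ divided by the sum of the two hyperbolic cotangents.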


Next, $\phi_2(t)$ is obtained from equation (35) by adding the term in 
equation (10) which corresponds to the transition time $t = T_3$: 

$$\phi_2(t) = (\Omega - \alpha)t + 2\alpha(T_1 - T_2)\,\frac{1 - \exp(-T_4/T_1)}{1 - \exp[-(T_3 + T_4)/T_1]}\,\bigl(1 - \exp(-t/T_1)\bigr) + 2\alpha(t - T_3) - 2\alpha(T_1 - T_2)\bigl(1 - \exp[-(t - T_3)/T_1]\bigr). \tag{38}$$
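Requiring that 

$$\phi_2(T_3 + T_4) = 2\pi \tag{11c}$$

and performing some simple manipulation leads to the result 

$$2\pi = \Omega(T_3 + T_4) - \alpha(T_3 - T_4). \tag{39}$$

Equations (37) and (39) can be normalized by letting 

$$y = \Omega/\alpha, \tag{2}$$

$$\tau_j = \alpha T_j, \qquad j = 1, 2, 3, 4. \tag{14}$$

Equation (39) becomes 

$$y = \frac{2\pi + \tau_3 - \tau_4}{\tau_3 + \tau_4}, \tag{15}$$

which is equation (15) in Section III. Equation (37) becomes 

$$y = 1 + \frac{1}{\tau_3}\left\{\pi - 4(\tau_1 - \tau_2)\Big/\left[\coth\frac{\tau_3}{2\tau_1} + \coth\frac{\tau_4}{2\tau_1}\right]\right\}. \tag{40}$$

Eliminating $y$ between equations (15) and (40) gives the constraint 
equation (16). 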
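The elimination of $y$ can be sketched as below. Equation (16) itself lies outside this appendix, so the last line is only the form implied by equations (15) and (40) as read here; the symbol $C$ is shorthand introduced for this sketch.

```latex
\begin{align*}
C &\equiv \coth\frac{\tau_3}{2\tau_1} + \coth\frac{\tau_4}{2\tau_1}, \\[4pt]
y - 1 &= \frac{2(\pi - \tau_4)}{\tau_3 + \tau_4}
  \quad\text{from (15)}, \qquad
(y - 1)\,\tau_3 = \pi - \frac{4(\tau_1 - \tau_2)}{C}
  \quad\text{from (40)}, \\[4pt]
4(\tau_1 - \tau_2)(\tau_3 + \tau_4)
  &= \bigl[\,2\tau_3\tau_4 + \pi(\tau_4 - \tau_3)\,\bigr]\, C .
\end{align*}
```

Two checks support this form: setting $\tau_4 = \pi$ gives a right-hand side of $\pi(\tau_3 + \pi)C$, and replacing $\pi$ by 0 gives $2(\tau_1 - \tau_2)(\tau_3 + \tau_4) = \tau_3\tau_4 C$, which is the jitter condition used in Appendix B.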
APPENDIX B 


The Phase Jitter Problem 


In this appendix it is proved that whenever the second-order PLL 
studied here achieves frequency lock, it also achieves phase lock: 

$$\lim_{t \to \infty} \phi(t) = 2n\pi, \qquad |y| < y_p. \tag{31}$$

It can be shown that for this system any steady-state phase jitter $\phi(t)$ 
is confined within the $\pm\pi$ neighborhood of some lock point $\phi = 2n\pi$. 
This is a result of the periodicity of the phase-plane geometry in the $\phi$ 
direction. However, since this proof requires a rather lengthy description 
of the properties of the phase-plane trajectories, it will be omitted. 
Below we prove that equation (31) holds when the phase error remains 
within such a $\pm\pi$ neighborhood of a lock point. 

The technique used previously to calculate $y_p$ can also be employed 
here. Assume that in the steady state, a periodic phase jitter $\phi(t)$ 
exists. Then the waveform $f[\phi(t)]$ is binary and periodic, and the phase 
error is again given by equation (8). Since we are now considering a 
phase jitter within $\pm\pi$ of $\phi = 2n\pi$, the requirements on $\phi_{ss}(t)$ are: 

$$\phi_{ss}(0) = \phi_{ss}(T_3) = \phi_{ss}(T_3 + T_4) = 2n\pi. \tag{41}$$


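Proceeding as in Appendix A, we obtain the equations 

$$y = \frac{\tau_3 - \tau_4}{\tau_3 + \tau_4}, \tag{42}$$

$$2(\tau_1 - \tau_2)(\tau_3 + \tau_4) = \tau_3\tau_4\left[\coth\frac{\tau_3}{2\tau_1} + \coth\frac{\tau_4}{2\tau_1}\right]. \tag{43}$$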


These equations may be written directly from equations (15) and (16) 
by replacing $\pi$ with 0. Below it is demonstrated that $\tau_3 = \tau_4 = 0$ is the 
only solution of these equations when $|y| < 1$, which proves equation 
(31). 

Using $|y| < 1$ in equation (42) yields 

$$\tau_3\tau_4 > 0, \tag{44}$$

so that only equation (43) must be considered. Using $r$ of equation 
(18) and dividing by $\tau_3\tau_4$ gives 

$$2\tau_1(1 - r)\left(\frac{1}{\tau_3} + \frac{1}{\tau_4}\right) = \coth\frac{\tau_3}{2\tau_1} + \coth\frac{\tau_4}{2\tau_1}. \tag{45}$$

Defining 

$$x_j = \frac{\tau_j}{2\tau_1}, \qquad j = 3, 4, \tag{46}$$

equation (45) becomes 

$$(1 - r)\left(\frac{1}{x_3} + \frac{1}{x_4}\right) = \coth x_3 + \coth x_4. \tag{47}$$

Recalling that $r \ge 0$, we have for $x > 0$: 

$$\frac{1 - r}{x} \le \frac{1}{x} < \coth x. \tag{48}$$

Thus the only possible nonnegative solution of equation (47) is 

$$x_3 = x_4 = 0, \tag{49}$$

which is the desired result. 
A Fast Method of Generating 
Digital Random Numbers 


By C. M. RADER,* L. R. RABINER and R. W. SCHAFER 
(Manuscript received June 12, 1970) 


In this article we propose a fast, efficient technique for generating a 
pseudorandom stream of uniformly-distributed numbers. The arithmetic 
operations required are an L bit exclusive-or, a rotation, and a shift to 
update the state of the number generator. With moderately large values of 
L we have been able to generate sequences of numbers whose periods are 
quite long (on the order of $2 \times 10^{9}$). Its simplicity of construction, 
as well as its ability to generate long streams of independent pseudorandom 
uniformly-distributed integers, make this noise generator a worthy candidate 
for use in high-speed digital systems. 


I. INTRODUCTION 


Almost all methods of generating digital random numbers use as 
input a set of previously generated random numbers which were 
produced by an iterative arithmetic process (modulo a large integer), e.g., 

$$X_n = F(X_{n-1}, X_{n-2}, \ldots, X_{n-k}) \bmod N, \qquad n = 0, 1, \ldots$$

with initial conditions 

$$X_{-1} = C_1, \quad X_{-2} = C_2, \quad \ldots, \quad X_{-k} = C_k.$$

For each new random number, $X_n$, this arithmetic process is repeated. 
The integer $N$ and the initial values $C_1, C_2, \ldots, C_k$ are chosen to 


* Mr. Rader is with Lincoln Laboratories, Massachusetts Institute of 
Technology, Lexington, Massachusetts. Lincoln Laboratories is operated with 
support from the U. S. Air Force. 
guarantee a large period of repetition. Methods of the type described 
above involve a considerable propagation delay, representing at the 
least one addition or one multiplication time, between the time the 
nth random number is put into its storage register, and the time at 
which the (n + 1)st random number is available. This delay is not 
generally a problem in most applications, because computational de- 
lays in other parts of most digital systems are far greater than those 
encountered in generating random numbers. However, it is conceivable 
that in the future a need will arise for which random numbers must 
be computed far more rapidly than is now necessary. With such a 
time in mind we propose the following algorithm, for which the time 
necessary to produce a new random number is equal to the sum 
of a flip-flop settling time, and the propagation delay of an exclusive- 
or gate. 


II. THEORY 


The algorithm for generating the L-bit random number $X_n$ from 
the two previous L-bit numbers $X_{n-1}$ and $X_{n-2}$ may be stated as 

$$X_n = T_P(X_{n-1} \oplus X_{n-2})$$

where $T_P(\cdot)$ denotes a cyclic rotation of P places to the right and $\oplus$ 
denotes exclusive-or. The algorithm requires L flip-flops to store $X_{n-1}$ 
and L flip-flops to store $X_{n-2}$. Each bit of the new random number $X_n$ 
is derived by an exclusive-or operation on a pair of corresponding 
bits in the previous random numbers. These bits are not stored in the 
bit positions from which they were produced, but instead each new 
bit is rotated cyclically to the right by P bit positions, with overflow 
on the right being fed into the left. The process is illustrated in Fig. 1 
for P = 1. At each cycle, the new random number generated ($X_n$) 
is clocked into the lower flip-flops ($X_{n-1}$); at the same time the number 
stored in the lower flip-flops ($X_{n-1}$) is clocked into the upper flip-flops 
($X_{n-2}$). For maximum period, P should be chosen mutually prime to L. 
Otherwise, the bit rotation may be shown to be composed of several 
interleaved rotations of shorter words. As seen from Fig. 1, the bit 
rotation does not constitute a separate hardware operation. In physical 
terms, the exclusive-or of two flip-flops in corresponding bit positions 
is clocked into a third flip-flop while one of the pair of flip-flops is 
clocked into the other. An example of the process is given in Table I 
for L = 3, P = 2. (The starting values of $X_{-1}$ = 000, $X_{-2}$ = 001 are 
used here.) The period of this generator is 15. It is easily shown that 
[Figure 1 labels: flip-flops storing $X_{n-2}$; exclusive-or gates producing $X_n$; flip-flops storing $X_{n-1}$.] 

Fig. 1—Schematic diagram showing how the random numbers are generated 
and stored. 

with the output depending on the state of six (2L) flip-flops, the 
maximum theoretical period of any random number generator is 64, or in 
general $2^{2L}$, and the maximum period of any random number generator 
using only flip-flops and exclusive-or elements is 63, or in general 
$2^{2L} - 1$, since the state when all the flip-flops are zero is succeeded 
only by itself. 
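
The update rule above can be sketched in a few lines of Python. This is a minimal software model written for illustration, not the authors' hardware; the function names `rotate_right` and `generate` are assumptions. With L = 3, P = 2 and the starting values $X_{-1}$ = 000, $X_{-2}$ = 001 it reproduces the "new random number" column of Table I.

```python
def rotate_right(word, places, length):
    """Cyclically rotate a `length`-bit word right by `places` bit positions."""
    places %= length
    mask = (1 << length) - 1
    return ((word >> places) | (word << (length - places))) & mask


def generate(length, places, x_prev=0, x_prev2=1, count=15):
    """Return `count` successive outputs X_n = T_P(X_{n-1} XOR X_{n-2}).

    x_prev plays the role of X_{n-1} and x_prev2 the role of X_{n-2}.
    """
    outputs = []
    for _ in range(count):
        x_new = rotate_right(x_prev ^ x_prev2, places, length)
        outputs.append(x_new)
        # Clock both registers: the new number becomes "most recent",
        # the old most-recent number becomes "next most recent".
        x_prev, x_prev2 = x_new, x_prev
    return outputs


if __name__ == "__main__":
    for clock, value in enumerate(generate(3, 2)):
        print(clock, format(value, "03b"))
```

In hardware the rotation is only a wiring pattern; in software it costs a shift, an OR, and a mask per word.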

Even though the period of the generator of Table I, and the periods 


TABLE I—TYPICAL OUTPUT SEQUENCE FOR NOISE GENERATOR 
WITH L = 3, P = 2 

    Clock     X_{n-1}        X_{n-2}             X_n 
    Number    Most Recent    Next Most Recent    New Random Number 
     0        000            001                 010 
     1        010            000                 100 
     2        100            010                 101 
     3        101            100                 010 
     4        010            101                 111 
     5        111            010                 011 
     6        011            111                 001 
     7        001            011                 100 
     8        100            001                 011 
     9        011            100                 111 
    10        111            011                 001 
    11        001            111                 101 
    12        101            001                 001 
    13        001            101                 001 
    14        001            001                 000 
    (15)      000            001                 010     (repeats) 

for other values of L, are small fractions of the theoretical maximum, 
it is possible to choose L to obtain a very long period. It is also 
possible, as we shall see, to combine the results of several generators to 
get still longer periods. We have theoretically predicted all the word 
lengths (L ≤ 25) for which very short periods result*, and we have 
measured the periods associated with the remaining values of L. The 
longest periods result for word lengths of L = 11, 13, 17, 19, 22, 23 
and 25. For 25 bits, the longest period results, and is 17,825,775. 
However, since $2^{25}$ is about $3.3 \times 10^{7}$, not all the possible 25 bit numbers 
appear at the output of the generator. The computation of the period 
assumes that the word is rotated a number of bits mutually prime to 
the word length (e.g., P = 1) and that the starting states are reasonable 
(e.g., $X_{-1}$ = 0, $X_{-2}$ = 1). There exist unreasonable starting states, such 
as all zeros in one word and all ones in the other word, which have 
much shorter periods than those prescribed, but these can be avoided 
in all cases by restricting the starting values to 0 and 1. 
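
A period measurement of the kind described can be sketched as follows. This is an illustrative reconstruction, not the authors' program; `measure_period` is a hypothetical name. Because the state update $(X_{n-1}, X_{n-2}) \to (X_n, X_{n-1})$ is invertible, the state sequence is purely periodic, so counting steps until the starting state recurs gives the period.

```python
def rotate_right(word, places, length):
    """Cyclically rotate a `length`-bit word right by `places` bit positions."""
    places %= length
    mask = (1 << length) - 1
    return ((word >> places) | (word << (length - places))) & mask


def measure_period(length, places=1, x_prev=0, x_prev2=1):
    """Count clock cycles until the register pair returns to its start state."""
    start = (x_prev, x_prev2)
    state = start
    steps = 0
    while True:
        most_recent, next_most_recent = state
        new = rotate_right(most_recent ^ next_most_recent, places, length)
        state = (new, most_recent)
        steps += 1
        if state == start:
            return steps


if __name__ == "__main__":
    # Small word lengths only; L = 25 needs about 1.8e7 iterations.
    for word_length in range(1, 5):
        print(word_length, measure_period(word_length))
```

The all-zero starting state returns to itself after one step, which is the degenerate case noted in the text.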

Table II lists the periods of the generators for values of L from 1 
to 25, as well as the factorization of these periods. The factorization 
is useful in predicting the periods associated with generators made up 
of two or more of these simple generators with interleaved bits. For 
example, it is possible to generate a 48 bit random number by 
interleaving the bits of a 25 bit word with the bits of a 23 bit word. The 
periods of the two generators may be found to have only the prime 
factor 3 in common, as seen from Table II. Thus the joint period (the 
least common multiple) is 1/3 of the product of the periods of the 
individual generators, resulting in a period of about $2 \times 10^{13}$. 
Similarly a 24 bit word could be made up of an 11 bit word and a 13 bit 
word, with a joint period of about $2 \times 10^{9}$. The disparity of prime 
factors among several interesting cases seems surprisingly fortuitous. 
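
The joint-period arithmetic can be checked directly. The sketch below takes the four periods from Table II and computes least common multiples; the names `PERIODS` and `joint_period` are illustrative.

```python
from math import gcd

# Periods of the individual generators, keyed by word length L (Table II).
PERIODS = {11: 33825, 13: 159783, 23: 4194303, 25: 17825775}


def joint_period(length_a, length_b):
    """Least common multiple of the periods of two interleaved generators."""
    a, b = PERIODS[length_a], PERIODS[length_b]
    return a * b // gcd(a, b)


if __name__ == "__main__":
    print("48-bit word (25 + 23):", joint_period(25, 23))
    print("24-bit word (11 + 13):", joint_period(11, 13))
```

Both pairs share only the factor 3, so each joint period is one third of the product of the individual periods.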


III. TESTS FOR RANDOMNESS 


It is clear that a long period does not by itself indicate a good random 
number generator (e.g., the iteration $x_n = x_{n-1} + 1$ would have a 


*It is possible to equate any bit (as a function of time) to the mod 2 sum 
of the same bit delayed by various amounts. For example, with L = 3 the 
equation for any bit of the 3 bit word is: 

$$b_n = b_{n-3} \oplus b_{n-4} \oplus b_{n-5} \oplus b_{n-6}.$$

This equation describes a particular 6 bit shift register with feedback. The 
analysis of such shift registers is described in Ref. 1. Specifically it is possible 
to obtain the maximum period associated with a given shift register, and 
therefore with a given random number generator, by examining the factors of certain 
characteristic polynomials over the Galois Field mod 2. 
TABLE II—THE PERIOD AND ITS DECOMPOSITION INTO ITS PRIME 
FACTORS FOR NOISE GENERATORS WITH VALUES 
OF L FROM 1 TO 25 

     L    Period      Factors 
     1    3           (3) 
     2    6           (2) (3) 
     3    15          (3) (5) 
     4    12          (2)^2 (3) 
     5    255         (3) (5) (17) 
     6    30          (2) (3) (5) 
     7    63          (3)^2 (7) 
     8    24          (2)^3 (3) 
     9    315         (3)^2 (5) (7) 
    10    510         (2) (3) (5) (17) 
    11    33825       (3) (5)^2 (11) (41) 
    12    60          (2)^2 (3) (5) 
    13    159783      (3) (13) (17) (241) 
    14    126         (2) (3)^2 (7) 
    15    255         (3) (5) (17) 
    16    48          (2)^4 (3) 
    17    65535       (3) (5) (17) (257) 
    18    630         (2) (3)^2 (5) (7) 
    19    14942265    (3) (5) (13) (19) (37) (109) 
    20    1020        (2)^2 (3) (5) (17) 
    21    4095        (3)^2 (5) (7) (13) 
    22    67650       (2) (3) (5)^2 (11) (41) 
    23    4194303     (3) (23) (89) (683) 
    24    120         (2)^3 (3) (5) 
    25    17825775    (3) (5)^2 (11) (17) (31) (41) 






very long period, but would be unacceptable to most users). A better 
indication of the acceptability of a random number generator is the 
autocorrelation function of the output of the generator, $R_x(n)$. It is 
difficult to obtain $R_x(n)$ theoretically for the generators described 
here; so instead we have estimated $R_x(n)$ by standard techniques for 
several cases of interest. In Fig. 2 we show the estimated autocorrelation 
function for the generator with L = 13. Approximately N = 15,000 
samples were used in the estimate, and Fig. 2 shows the results for up 
to 512 delays. It seems clear that there are no irregularities present in 
the autocorrelation function. The peak values of the autocorrelation 
function of Fig. 2, for n ≠ 0, are ±0.026. It is easily shown that 
estimates of the autocorrelation function (for n ≠ 0) tend to be normally 
distributed random variables with zero mean, and variance of 1/N. For 
the data of Fig. 2, the standard deviation, σ, was calculated to be 0.008. 
Cramer’ shows that the expected value of the upper extreme of 512 
values from a normal population with zero mean, and standard deviation 
of σ, is approximately 3.25σ, or 0.026 in this example. The 50 percent 
Fig. 2—Autocorrelation function of noise generator with L = 13. 


confidence interval for the upper extreme is about 0.4σ wide, thus peak 
values of the autocorrelation estimates from 0.024 to 0.028 would be 
quite common. Therefore the peak values of ±0.026 in Fig. 2 are not 
inconsistent with the above theoretical results based on a true normal 
population. 
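
The numbers in this argument follow from N alone, as the short sketch below shows. The factor 3.25 is the value quoted in the text for the upper extreme of 512 normal samples and is used here as given, not rederived; `extreme_value_check` is an illustrative name.

```python
import math


def extreme_value_check(n_samples=15000, extreme_factor=3.25):
    """Return (sigma, expected peak) for correlation estimates of variance 1/N."""
    sigma = 1.0 / math.sqrt(n_samples)
    return sigma, extreme_factor * sigma


if __name__ == "__main__":
    sigma, peak = extreme_value_check()
    print(f"sigma = {sigma:.4f}, expected upper extreme = {peak:.4f}")
```

Carrying the unrounded σ through gives an expected extreme slightly above the quoted 0.026, which was evidently computed from σ rounded to 0.008; either value is compatible with the observed peaks.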

To empirically test the uniformity of the output of the generators 
for L = 11 and L = 13, we measured the number of occurrences of 
each of the $2^L$ output states during a single period. For both cases all 
states were present at the output of the generator. In Fig. 3, we show 
plots for L = 11 and 13 of the number of occurrences of each of the 
128 cells specified by the 7 most significant bits of the output as a 
function of cell number. The upper plot shows the measured result 
for L = 11, and the lower plot shows the measured result for L = 13. 
The solid lines across the plots indicate the expected number of 
occurrences of each state based on uniformity assumptions. The plots 
of Fig. 3 tend to validate the assertion that the amplitude distribution 
of the output of the noise generator is uniform for certain values of L. 

We have also measured the mean value for the entire sequence for 
L = 11 and L = 13 and it is near to $2^{L-1}$ if the L bits are interpreted 
as a positive integer. The nearest means obtained were 1024.3169 for 
L = 11 and 4095.8326 for L = 13. (Starting values for the two sequences 
were $X_{-1}$ = 341, $X_{-2}$ = 0 for L = 11, and $X_{-1}$ = 151, $X_{-2}$ = 0 
for L = 13.) Also, it was found that a scatter diagram from the 
generator with L = 17 showed no tendencies to order, such as are common 

Nonorthogonal Optical Waveguides 
and Resonators 


By J. A. ARNAUD 
(Manuscript received May 4, 1970) 


The modes of propagation in optical systems which do not possess 
meridional planes of symmetry (nonorthogonal systems) are investigated 
in the case where the effect of apertures and losses can be neglected. The 
fundamental mode of propagation is obtained with the help of a complex 
ray pencil concept. An integral transformation of the field, based on a 
quasi-geometrical optics approximation and a first-order expansion of the 
point characteristic of the optical system, is given; it shows that the complex 
(three-dimensional) wavefront of the fundamental mode is transformed 
according to a generalized “ABCD law.” A simple expression is also 
obtained for the phase-shift experienced by the beam. The higher order 
modes of propagation are obtained from a power series expansion of the 
fundamental mode. These higher order modes are expressed, in oblique 
coordinates, as the product of the fundamental solution and finite series of 
Hermite polynomials with real arguments. In the special case of systems 
with rotational symmetry, these series reduce to the well-known generalized 
Laguerre polynomials. The theory is applicable to media such as helical 
gas lenses and optical waveguides suffering from slowly varying 
deformations in three dimensions. Nonorthogonal resonant systems are also 
investigated. An expression for the resonant frequencies, applicable to any 
three-dimensional resonator, is derived. Numerical results are given for the 
resonant frequencies and the resonant field of a twisted path cavity which 
exhibits interesting properties: the usual polarization degeneracy is lifted 
and the intensity pattern of all of the modes possesses a rotational symmetry. 


I. INTRODUCTION 


An optical system, or a resonator, is called “nonorthogonal” when 
it is not possible to define two mutually orthogonal meridional planes 
of symmetry (Ref. 1, p. 240). The helical gas lens** is an example 
of a nonorthogonal lenslike medium. A conventional ring type cavity 
generally ceases to be orthogonal when its path is twisted, i.e., 
becomes nonplanar.* 

Let us briefly review the major approaches in the theory of optical 
resonators. The field in a resonator can be expressed exactly in 
terms of known functions only for a few simple boundary surfaces. 
No exact solution is available for nonorthogonal systems. However, 
we are interested only in the high frequency operation of large 
resonators. In that limit, the waves have a tendency to follow closed 
curves in the resonator, either clinging to the concave parts of the 
boundary (whispering gallery modes®) or connecting opposite points 
of the boundary (bouncing ball modes). One defines the axial mode 
number as the number of wavelengths existing along such closed 
curves. The nodes of the field in the transverse planes define the 
transverse mode numbers. More insight concerning the mode structure 
and the resonant frequencies can be gained by using a geometrical 
optics approximation, or a paraxial form of the Huygens diffraction 
principle. The geometrical optics approach was developed by Keller 
and Rubinow.® It consists of setting up in the resonator a manifold 
of rays tangent to a caustic. The location of the caustic and the 
resonant frequencies are obtained from the condition that the varia- 
tions of the eikonal along three independent closed curves are equal 
to an integral number of wavelengths (or an integer plus one-half 
or one-quarter). This theory, which is analogous to the Born ap- 
proximation of quantum mechanics, gives the exact resonant fre- 
quencies of paraxial modes. The geometrical optics field, when ex- 
tended in the shadow of the caustic by analytic continuation, provides 
an acceptable approximation to the exact field for large transverse 
mode numbers but, for the fundamental mode, it differs vastly from 
the exact field. The caustic line however, does coincide, in two di- 
mensions, with the mode profile.” This geometrical optics method 
has been extended to nonorthogonal resonators incorporating homo- 
geneous media by Popov,® who gave an expression for the resonant 
frequencies. Within the paraxial approximation, exact solutions for 
the field can be obtained from the Huygens principle; for that reason, 
the geometrical optics method, in spite of its general interest, will 
not be discussed further in this paper. 

For the case of resonators incorporating inhomogeneous media, 
the Huygens principle must be supplemented by a quasi-geometri- 
cal optics approximation. This approximation consists of assuming 
that a point source at the input plane of the system creates at the 
output plane a field which can be adequately represented by the 
Fig. 3—Measured distribution functions for noise generators with L = 11 and 
L = 13. 


in the simple multiplicative congruence generators. It is expected that 
this result would be true for any of the generators with reasonably 
long periods. 

It was stated earlier that it is possible to generate longer random 
numbers than 25 bits by interleaving bits of shorter generators. Be- 
sides the prerequisite that the periods of the individual generators have 
few or no prime factors in common, it is important that the outputs 
of the individual noise generators be uncorrelated. To check whether 
Fig. 4—Cross-correlation function of noise generators with L = 11 and L = 13. 
or not the individual outputs of the noise generators, for the desirable 
values of L, were uncorrelated, we again used standard statistical 
techniques to measure the cross-correlation function. Figure 4 shows the 
cross-correlation function, $R_{xy}(n)$, for the case where one input was 
the output of the generator with L = 11, and the other input was the 
output of the generator with L = 13. The cross-correlation function is 
plotted for delays up to 512 samples, and is again based on 
approximately N = 15,000 samples. There are no apparent irregularities seen 
in this figure, and the peak values of the cross-correlation function, 
±0.027, are quite close to similar peaks observed for the 
autocorrelation function of Fig. 2, and again consistent with the assumption that 
these 512 estimates are from a normal population with σ = 0.008. 

The reader should be cautioned that a poor choice of the rotation P 
can cause an ordering in the pseudorandom outputs which may be 
harmful for some applications. For example, consider the set of all 
triples $(x_n, x_{n+1}, x_{n+2})$ when P = 1. If the sign bits (using two's 
complement integer notation) of $x_n$ and $x_{n+1}$ are the same, the most 
significant bit of $x_{n+2}$ (the bit just below the sign bit) will always be 
zero. Thus, if $(x_n, x_{n+1})$ lies in quadrants 1 or 3, $x_{n+2}$ is constrained 
to either 

$$0 \le x_{n+2} < 2^{L-2}$$

or 

$$-2^{L-1} \le x_{n+2} < -2^{L-2}.$$

A similar constraint results when $(x_n, x_{n+1})$ lies in quadrants 2 or 4. 
This effect is eliminated by choosing P ≈ L/2, but mutually prime to 
L. We have not considered what ordering might exist in quadruples, 
quintuples, etc., for various choices of P. 
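
The ordering effect for P = 1 can be checked numerically. The sketch below is an illustration with hypothetical function names. It runs the generator and tests, for every triple whose first two members share a sign bit, whether the bit just below the sign bit of the third member is zero.

```python
def rotate_right(word, places, length):
    """Cyclically rotate a `length`-bit word right by `places` bit positions."""
    places %= length
    mask = (1 << length) - 1
    return ((word >> places) | (word << (length - places))) & mask


def outputs(length, places, count, x_prev=0, x_prev2=1):
    """Return `count` generator outputs as unsigned `length`-bit integers."""
    values = []
    for _ in range(count):
        new = rotate_right(x_prev ^ x_prev2, places, length)
        values.append(new)
        x_prev, x_prev2 = new, x_prev
    return values


def triple_constraint_holds(length, places, count=5000):
    """True if equal sign bits in x_n, x_{n+1} always force bit L-2 of x_{n+2} to 0."""
    xs = outputs(length, places, count)
    sign = length - 1
    for n in range(count - 2):
        same_sign = (xs[n] >> sign) == (xs[n + 1] >> sign)
        if same_sign and (xs[n + 2] >> (length - 2)) & 1:
            return False
    return True


if __name__ == "__main__":
    print("L = 11, P = 1:", triple_constraint_holds(11, 1))
    print("L = 11, P = 5:", triple_constraint_holds(11, 5))
```

With P = 1 the constraint holds by construction, because the exclusive-or of the two sign bits is rotated into the next lower bit position. With P near L/2 that position is fed from a low-order bit instead, which is the remedy suggested in the text.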


IV. CONCLUSION 


In conclusion, we have presented a fast and efficient technique for 
generating digital random numbers. The simple statistical tests to 
which we have subjected several of these noise generators indicate they 
are more than adequate for use in simulation programs for communi- 
cations systems. 
geometrical optics field. This approximation is generally applicable 
to optical waveguides and resonators if one disregards the effect of 
apertures and assumes that no diffraction gratings or other wave- 
length-dependent scatterers are present. This quasi-geometrical optics 
method provides an integral transformation for the field which is 
equivalent to a partial differential equation of the parabolic type 
(see Section II). The similarity between this parabolic equation and 
the Schroedinger equation has often been pointed out.°** The 
matched modes of propagation in uniform lens-like media with 
hyperbolic secant refractive index laws, for instance, can be found 
in Landau and Lifshits’ Quantum Mechanics** [whereas the ray 
trajectories are given in Ref. (1), p. 179]. The more general prob- 
lem of unmatched beams in nonuniform lens-like media corresponds 
to the time-dependent Schroedinger equation with time-varying po- 
tentials. The adiabatic approximation usually applied to this problem, 
is based on conditions? which are too stringent for most optical 
systems. Generalized modes, where allowance is made for a wave- 
front curvature, were introduced by Goubau and Schwering’® and 
Pierce’® for the free-space case, in agreement with the theory of 
confocal resonators proposed by Boyd and Gordon.1? These results 
were extended to orthogonal square law media.1®?*?° The transforma- 
tion of the complex curvature of beams through arbitrary optical 
systems with rotational symmetry and the resonant frequency of 
linear cavities was obtained by Kogelnik.**?2 Vlasov and Talanov”* 
have observed that, in two dimensions, the phase shift experienced by 
a matched beam in an optical system is equal to the phase of one 
of the two ray-matrix eigenvalues. This result is easily demonstrated 
and generalized to astigmatic orthogonal systems by using a complex 
ray pencil concept.* 4 

The generalization to nonorthogonal systems is substantially more 
intricate. Arnaud and Kogelnik?* have obtained a generalized gaus- 
sian mode of propagation in free space by giving complex values 
to the three parameters which define an astigmatic ray pencil, i.e., 
the position of the focal lines and the angular orientation of one of 
them. This solution can be used to obtain the beam transformation 
in a sequence of thin astigmatic lenses arbitrarily oriented, by 
matching the complex wavefronts at each lens. This method does not 
give, however, a general expression for the phase shift experienced 
by the beam, knowledge of which is essential in studying resonators. 
For that reason, a somewhat different approach is used here, where 
the ray pencil is defined by two of its rays. The field of the fundamental 
mode of propagation is obtained (Section III) by allowing 
these two rays to assume complex positions while remaining solu- 
tions of the ray equations. 

The higher order modes of propagation are studied in Section IV. 
They are obtained by application of differential operators related to 
those used in quantum mechanics. An oblique coordinate system 
is introduced which diagonalizes the complex wavefront of the funda- 
mental mode. In this oblique coordinate system, the higher order 
modes can be expressed as the product of the fundamental solution 
and finite series of Hermite polynomials with real arguments. An 
alternative procedure is also given which leads to Hermite polynomials 
in two complex variables. The simple formula for the resonant fre- 
quencies of linear resonators given by Popov®?* is shown to be ap- 
plicable to ring type resonators incorporating inhomogeneous media 
(Section V). Finally these general results are applied to a new type 
of optical resonator called “cavity with image rotation” which pre- 
sents interesting resonance and polarization properties (Section VI). 
Numerical results are presented. 

The present theory is limited to paraxial first-order solutions in 
loss-less isotropic media. As indicated before, it is assumed that no 
apertures or diffraction gratings are present in the system, and the 
problem of mode selection is not discussed. The electromagnetic field 
is treated as a scalar quantity and the polarization effects are in- 
troduced only at a later stage; this is permissible within the paraxial 
approximation. Fresnel reflection at surfaces of discontinuity is also 
neglected. 


II. PARABOLIC WAVE EQUATION AND INTEGRAL TRANSFORMATION OF THE 
FIELD 


In this section an approximate form of the scalar Helmholtz 
equation is derived which is applicable to paraxial beams, i.e., to beams 
propagating at small angles with respect to the system axis. It is 
subsequently compared to an integral transformation derived from 
Huygens principle. 

The scalar Helmholtz equation can be written in a $x_1$, $x_2$, $z$ 
rectangular coordinate system 

$$\frac{\partial^2 E}{\partial x_1^2} + \frac{\partial^2 E}{\partial x_2^2} + \frac{\partial^2 E}{\partial z^2} + k^2 n^2(x_1, x_2, z)E = 0, \tag{1}$$

where $E$ is a component of the field and $n(x_1, x_2, z)$ the refractive 
index of the medium. Let us introduce a reduced field 

$$\psi(x_1, x_2, z) = E(x_1, x_2, z)\exp\left[ik\int^{z} n(0, 0, z)\,dz\right], \tag{2}$$

and neglect the second derivative of $\psi$ with respect to $z$. This 
approximation physically means that only waves propagating in a direction 
close to the $z$ axis are considered. Denoting $n(0, 0, z)$ by $n_0$, for 
brevity, one obtains 


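$$\frac{\partial^2\psi}{\partial x_1^2} + \frac{\partial^2\psi}{\partial x_2^2} - 2ikn_0\frac{\partial\psi}{\partial z} - ik\frac{dn_0}{dz}\psi + k^2(n^2 - n_0^2)\psi = 0. \tag{3}$$

This equation can be simplified if one introduces the following 
changes of function and variables** 

$$\Psi = n_0^{1/2}\psi, \tag{4}$$

$$d\zeta = dz/n_0. \tag{5}$$

One obtains 

$$\frac{\partial^2\Psi}{\partial x_1^2} + \frac{\partial^2\Psi}{\partial x_2^2} - 2ik\frac{\partial\Psi}{\partial\zeta} + k^2(n^2 - n_0^2)\Psi = 0. \tag{6}$$
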


Let us further assume that $n^2 - n_0^2$ is a quadratic form in $x_1$, $x_2$ 

$$n^2 = n_0^2 + \eta_{11}x_1^2 + 2\eta_{12}x_1x_2 + \eta_{22}x_2^2. \tag{7a}$$

$\eta_{11}$, $\eta_{12}$ and $\eta_{22}$ are real functions of $z$ since the losses in the medium 
are neglected. The quadratic form given in equation (7a) describes a 
nonorthogonal optical system when the directions of its axes change as $z$ 
varies. In that case, the cross term $2\eta_{12}x_1x_2$ cannot be eliminated by 
rotating the coordinate system about $z$. We discuss this general case. 
Let us rewrite equation (7a), for brevity, in matrix form 

$$n^2 = n_0^2 + \tilde{r}\eta r, \tag{7b}$$

where $r$ denotes a column matrix with elements $x_1$, $x_2$ and $\eta$ denotes a 
2 × 2 real symmetrical matrix. The sign ~ indicates a transposition. 
Inserting equation (7b) in equation (6), the wave equation assumes 
the form 

$$\left(\nabla^2 - 2ik\frac{\partial}{\partial\zeta} + k^2\tilde{r}\eta r\right)\Psi = 0, \tag{8}$$

where $\nabla^2$ denotes the laplacian operator in the transverse $x_1$, $x_2$ plane. 
It is henceforth assumed that $n^2 - n_0^2$ is small compared with unity. 
Within this (first-order) approximation, the refractive index law, 
equation (7b), becomes 

$$n = n_0 + \tilde{r}\eta r/(2n_0). \tag{9}$$


Let us now consider the ray trajectories. A ray ® is defined at any 
transverse plane z by its position q(z) and by the projection p(z) on 
that plane of a vector directed along the ray, of length equal to the 
refractive index n. g(z) and p(z) are called respectively the position 
vector and the direction vector of the ray. It is convenient to represent 
these vectors by column matrices whose elements are the vector 
components on x, , x2 . As long as only fixed coordinate systems are 
used, such matrices can be denoted without ambiguity q(z) and p(z), 
or simply g and p. The exact ray equations are (see, for instance, Ref. 1, 
p. 90) 


p = —nVH(r, p), (10a) 
qG NoV pH (r, p) , (10b) 


at $r = q$. In equation (10) the upper dots denote differentiations with
respect to $\zeta$, and $H(r, p)$ denotes the Hamiltonian of the system
defined by

$$H(r, p) = -(n^2 - \tilde{p}p)^{1/2}; \tag{10c}$$


$\nabla$ denotes the gradient operator in the transverse $x_1$, $x_2$ plane, and $\nabla_p$
denotes a gradient operator relative to the $p$ variables. Within the
first-order approximation [equation (9)], equations (10c), (10a) and (10b)
reduce respectively to

$$H(r, p) = -n_0 - (\tilde{r}\eta r - \tilde{p}p)/(2n_0), \tag{11c}$$
$$\dot{p} = \eta q, \tag{11a}$$
and
$$\dot{q} = p. \tag{11b}$$


Equations (11a) and (11b) are called the paraxial ray equations.

Let us now consider two arbitrary rays, $\mathcal{R}$ and $\bar{\mathcal{R}}$, defined by their
position and direction vectors $q$, $p$ and $\bar{q}$, $\bar{p}$, respectively, and let the
"product" of these two rays be defined by the scalar expression

$$(\mathcal{R}; \bar{\mathcal{R}}) = \tilde{q}\bar{p} - \tilde{\bar{q}}p. \tag{12}$$
$(\mathcal{R}; \bar{\mathcal{R}})$ is sometimes called the Lagrange invariant (see Ref. 1, p. 251).
It is easy to show that this quantity is independent of $\zeta$ (or $z$). Indeed,
applying equations (11a) and (11b) to both $\mathcal{R}$ and $\bar{\mathcal{R}}$, and remembering
that $\eta$ is a symmetric matrix, one obtains†

$$\frac{d}{d\zeta}(\mathcal{R}; \bar{\mathcal{R}}) = \frac{d}{d\zeta}(\tilde{q}\bar{p} - \tilde{\bar{q}}p) = \tilde{p}\bar{p} + \tilde{q}\eta\bar{q} - \tilde{\bar{p}}p - \tilde{\bar{q}}\eta q = 0. \tag{13}$$


The Lagrange invariant $(\mathcal{R}; \bar{\mathcal{R}})$ plays an important role in the present
theory. Notice that $n_0$ does not appear explicitly in equations (8),
(11a) and (11b). It can therefore be assumed, without loss of generality,
that $n_0 = 1$.
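
The invariance expressed by equation (13) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the index profile `eta(z)`, the initial rays, and the step count are arbitrary values assumed for illustration, and a fourth-order Runge-Kutta integrator stands in for an exact solution of equations (11a) and (11b).

```python
import math

def eta(z):
    # Hypothetical symmetric 2x2 matrix with a cross term (nonorthogonal medium).
    return [[-1.0 + 0.3 * math.sin(z), 0.4 * math.cos(2.0 * z)],
            [0.4 * math.cos(2.0 * z), -2.0]]

def deriv(z, ray):
    # Paraxial ray equations (11a), (11b): dq/dz = p, dp/dz = eta q (n0 = 1).
    q1, q2, p1, p2 = ray
    e = eta(z)
    return [p1, p2, e[0][0] * q1 + e[0][1] * q2, e[1][0] * q1 + e[1][1] * q2]

def rk4_step(z, ray, dz):
    k1 = deriv(z, ray)
    k2 = deriv(z + dz / 2, [s + dz / 2 * k for s, k in zip(ray, k1)])
    k3 = deriv(z + dz / 2, [s + dz / 2 * k for s, k in zip(ray, k2)])
    k4 = deriv(z + dz, [s + dz * k for s, k in zip(ray, k3)])
    return [s + dz / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(ray, k1, k2, k3, k4)]

def propagate(ray, z_end, steps=2000):
    z, dz = 0.0, z_end / steps
    for _ in range(steps):
        ray = rk4_step(z, ray, dz)
        z += dz
    return ray

def lagrange_invariant(ray, ray_bar):
    # Equation (12): q~ p_bar - q_bar~ p, with each ray stored as [q1, q2, p1, p2].
    q1, q2, p1, p2 = ray
    qb1, qb2, pb1, pb2 = ray_bar
    return (q1 * pb1 + q2 * pb2) - (qb1 * p1 + qb2 * p2)

ray = [1.0, 0.2, 0.0, 0.5]
ray_bar = [-0.3, 1.0, 0.7, 0.1]
before = lagrange_invariant(ray, ray_bar)
after = lagrange_invariant(propagate(ray, 3.0), propagate(ray_bar, 3.0))
print(before, after)
```

The two printed values should agree to within the integration error, whatever symmetric `eta(z)` is chosen.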

The properties of propagating beams are sometimes more easily
understood by considering the transformation of the field between the
input plane and the output plane of an optical system described by its
point characteristic. Let us now choose as optical axis, for generality,
an arbitrary ray, which need not be a straight line nor even a plane
curve. Let us further define, at a distance $z'$ from an origin 0, a
rectangular coordinate system $x_1'$, $x_2'$, whose axes are oriented respectively
along the principal normal and the binormal to this ray (see Fig. 1). At any
given transverse plane, a ray is defined by its position vector $q$ and its
direction vector $p$. Let us assume that there is one ray, and only one
ray, which goes from a point $r$ at $z = 0$ (input plane) to a point $r'$ at
$z = z'$ (output plane). This assumption implies, in particular, that the
planes $z = 0$ and $z = z'$ are not conjugate. The optical length $\mathcal{U}(r, r')$
of such a ray is called the point characteristic of the optical system.
As is well known, the direction vectors of a ray can be obtained from $\mathcal{U}$
by differentiation (Ref. 1, p. 97)

$$p = -\nabla\mathcal{U}(r, r'), \tag{14a}$$
$$p' = \nabla'\mathcal{U}(r, r'), \tag{14b}$$
at $r = q$, $r' = q'$. The primes always denote quantities at the output
plane $z = z'$.


† Recall also that, for any conformable matrices $a$ and $b$, $(ab)^{\sim} = \tilde{b}\tilde{a}$ and that, for
any scalar (one-element matrix) $c$, we have $\tilde{c} = c$.

Fig. 1. Optical axis of a ring type resonator. $\mathcal{U}$ denotes the point characteristic
of the system included between two transverse planes, $z = 0$ and $z = z'$.

The law of transformation of the field can be obtained from the
Huygens principle supplemented by a quasi-geometrical optics
approximation. The Huygens principle states that each point of an incident
wavefront can be considered as the source of a secondary wave. The
quasi-geometrical optics approximation consists of assuming that the
field created at the output plane of the system by a point source at
the input plane is adequately represented by the geometrical optics
field. These two assumptions allow us to express the field $E'(r')$ at the
output plane as a function of the field $E(r)$ at the input plane. Within
the paraxial approximation, we have


$$E'(r') = \pm j\lambda^{-1}\iint_{-\infty}^{+\infty} E(r)\,K(r, r')\,d^2r, \tag{15a}$$
where
$$K(r, r') = \left|\partial^2\mathcal{U}/\partial x_i\,\partial x_j'\right|^{1/2}\exp[-jk\,\mathcal{U}(r, r')]. \tag{15b}$$
The term $|\partial^2\mathcal{U}/\partial x_i\,\partial x_j'|^{1/2}$, where the bars denote a determinant, is
obtained by recognizing that the power flowing through a small area
at the output plane is equal to the power flowing in the corresponding
cone of rays leaving the point source at the input plane, and using
equation (14a).
To first order, the quantity $S = \mathcal{U} - z'$ is a quadratic form in $x_1$,
$x_2$, $x_1'$, $x_2'$ which can be written, in matrix notation


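$$S = \tfrac{1}{2}(\tilde{r}Ur + \tilde{r}Vr' + \tilde{r}'\tilde{V}r + \tilde{r}'Wr'), \tag{16}$$


where $U$ and $W$ are 2 × 2 symmetric real matrices and $V$ is a 2 × 2
real matrix. Equation (16) can be rewritten, more concisely

$$S = \tfrac{1}{2}\begin{bmatrix}\tilde{r} & \tilde{r}'\end{bmatrix}\begin{bmatrix}U & V\\ \tilde{V} & W\end{bmatrix}\begin{bmatrix}r\\ r'\end{bmatrix} \equiv \tfrac{1}{2}\begin{bmatrix}\tilde{r} & \tilde{r}'\end{bmatrix}[S]\begin{bmatrix}r\\ r'\end{bmatrix}. \tag{17}$$


Introducing equation (16) in equations (14a) and (14b), one obtains
linear relations between $p$, $p'$ and $q$, $q'$ in the form

$$p = -(Uq + Vq'), \qquad p' = \tilde{V}q + Wq'. \tag{18}$$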


It is sometimes convenient to introduce a ray matrix which relates 
q’, p’ to q, p. Simple relations exist between [S] and the ray matrix; 
they are given in Appendix A. 

Let us now go back to the integral transformation and observe that,
if $S$ is a quadratic form [equation (16)], the determinant

$$|\partial^2\mathcal{U}/\partial x_i\,\partial x_j'| = |\partial^2 S/\partial x_i\,\partial x_j'| = |V| \tag{19}$$

is independent of $r$ and $r'$. This term can consequently be taken out of
the integral in equation (15). The integral transformation of the
reduced field $\psi$ [$\psi = E\exp(jkz)$] becomes

$$\psi'(r') = \pm j\lambda^{-1}|V|^{1/2}\iint\psi(r)\exp(-jkS)\,d^2r, \tag{20}$$

whose kernel is essentially
$$K_\psi = |V|^{1/2}\exp(-jkS). \tag{21}$$


Let us show that, in a rectangular coordinate system, $K_\psi$ represents
the Green function of the parabolic wave equation, equation (8), i.e.,
that

$$\mathcal{L}'K_\psi \equiv \left(\nabla'^2 - 2jk\,\frac{\partial}{\partial z'} + k^2\,\tilde{r}'\eta r'\right)[|V|^{1/2}\exp(-jkS)] = 0. \tag{22}$$

The first term in equation (22) can be written, using equation (16)

$$\nabla'^2[|V|^{1/2}\exp(-jkS)] = |V|^{1/2}(-jk\nabla'^2 S - k^2\nabla'S\cdot\nabla'S)\exp(-jkS)$$
$$= |V|^{1/2}\exp(-jkS)(-jk\,\mathrm{Spur}\,W - k^2\nabla'S\cdot\nabla'S). \tag{23}$$

To evaluate the second term in equation (22) one needs to know the
derivative of $S$ with respect to $z'$. We have (Ref. 1, p. 97)

$$\frac{\partial\mathcal{U}}{\partial z'} = -H(r', p') = 1 + (\tilde{r}'\eta r' - \tilde{p}'p')/2, \tag{24}$$
where the paraxial approximation of $H$, equation (11c), has been used.
Therefore, introducing the expression for $p'$, equation (14b), in equation
(24), one obtains

$$\frac{\partial S}{\partial z'} = \frac{\partial\mathcal{U}}{\partial z'} - 1 = (\tilde{r}'\eta r' - \nabla'S\cdot\nabla'S)/2. \tag{25}$$
One also needs to know the derivative of $V$ with respect to $z'$. It is
obtained by introducing the quadratic form, equation (16), in both
sides of equation (25)

$$\tilde{r}\frac{dU}{dz'}r + 2\tilde{r}\frac{dV}{dz'}r' + \tilde{r}'\frac{dW}{dz'}r' = \tilde{r}'\eta r' - (\tilde{r}V + \tilde{r}'W)(\tilde{V}r + Wr'). \tag{26}$$
Equation (26) shows, upon identification, that

$$\frac{dV}{dz'} = -VW. \tag{27}$$

Therefore (see Ref. 29)

$$\frac{d}{dz'}|V|^{1/2} = \tfrac{1}{2}|V|^{-1/2}\frac{d}{dz'}|V| = \tfrac{1}{2}|V|^{1/2}\,\mathrm{Spur}\left(V^{-1}\frac{dV}{dz'}\right) = -\tfrac{1}{2}|V|^{1/2}\,\mathrm{Spur}\,W. \tag{28}$$


Upon substitution of equations (23), (25) and (28), one finds that 
equation (22) is satisfied. 
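
Equations (27) and (28) can be illustrated on the simplest case. The sketch below assumes free space, for which the reduced point characteristic is taken to be $S = |r - r'|^2/(2z')$, so that $V = -I/z'$ and $W = I/z'$ with $I$ the unit matrix; this choice, the function names, and the finite-difference step are assumptions made for illustration only.

```python
def V_of(z):
    # Free-space off-diagonal block of [S]: V = -I/z (assumed example).
    return [[-1.0 / z, 0.0], [0.0, -1.0 / z]]

def W_of(z):
    # Free-space output block of [S]: W = I/z (assumed example).
    return [[1.0 / z, 0.0], [0.0, 1.0 / z]]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def check_free_space(z, dz=1e-5):
    # Residual of equation (27), dV/dz' + V W, by central differences.
    Vp, Vm = V_of(z + dz), V_of(z - dz)
    VW = matmul2(V_of(z), W_of(z))
    err27 = max(abs((Vp[i][j] - Vm[i][j]) / (2 * dz) + VW[i][j])
                for i in range(2) for j in range(2))
    # Residual of equation (28), d|V|^(1/2)/dz' + (1/2)|V|^(1/2) Spur W.
    d_sqrt = (det2(Vp) ** 0.5 - det2(Vm) ** 0.5) / (2 * dz)
    spur_W = W_of(z)[0][0] + W_of(z)[1][1]
    err28 = abs(d_sqrt + 0.5 * det2(V_of(z)) ** 0.5 * spur_W)
    return err27, err28

err27, err28 = check_free_space(2.0)
print(err27, err28)
```

Both residuals should vanish to within the finite-difference error.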

Consequently, within the first-order approximation, one may use 
indifferently the parabolic wave equation, equation (8), or the integral 
transformation, equation (20). Most of the demonstrations given in 
the following sections are based on both formulations. 


III. FUNDAMENTAL MODE OF PROPAGATION


We know that in the high frequency limit, propagating beams 
closely resemble ray pencils. Let us therefore consider first the field 
of such ray pencils, and subsequently see how this solution can be 
generalized to take into account diffraction effects. 

A ray pencil is, in general, astigmatic; it can be defined, in free 
space, as the manifold of rays which intersect two mutually perpen- 
dicular focal lines. At any point, a surface exists, called the wave- 
front, which is perpendicular to all of these rays. The field of ray 
pencils propagating in inhomogeneous media can be written in a 
$x_1$, $x_2$, $z$ rectangular coordinate system


$$E(x_1, x_2, z) = A\exp(-jkS), \tag{29}$$


where $A$ and $S$ are real functions of $x_1$, $x_2$ and $z$. $A$ is an amplitude
factor and $S$ is called the eikonal of the geometrical optics field. The
surfaces $S$ = constant are the equations of the wavefronts associated
with the manifold of rays. Let us assume that one of the rays
coincides with the $z$-axis and that the refractive index of the medium is
unity on that axis. Within the first-order approximation, $\Phi = S - z$
is a quadratic form in the transverse variables $x_1$ and $x_2$, whose
coefficients are slowly varying functions of $z$, and $A$ is independent of
$x_1$, $x_2$. $\Phi$ can be written, in matrix notation


$$\Phi(r, z) = \tfrac{1}{2}\tilde{r}\mu(z)r, \tag{30}$$


where $\mu(z)$ is a 2 × 2 symmetrical matrix which generally depends on $z$.
The law of conservation of power dictates that $A$ and $\mu$ cannot be
independent; a wavefront with a positive curvature, for instance,
corresponds to a contraction of the ray pencil as $z$ increases, which necessarily
results in an increased intensity. To express this relation between $A$
and $\mu$ (transport equation), let us choose any two rays of the ray pencil
such as $\mathcal{R}$ and $\bar{\mathcal{R}}$. Since $\mathcal{R}$ and $\bar{\mathcal{R}}$ are both perpendicular to the
wavefront, one has, from equation (30)


$$p = \nabla\Phi(q) = \mu q, \tag{31a}$$
$$\bar{p} = \nabla\Phi(\bar{q}) = \mu\bar{q}, \tag{31b}$$


where $\nabla$ denotes as before the gradient operator in the $x_1$, $x_2$ plane.
Equations (31a and b) can be written more concisely:


$$P = \mu Q, \tag{32a}$$
where we have defined

$$Q \equiv [q \;\; \bar{q}], \tag{32b}$$

$$P \equiv [p \;\; \bar{p}]. \tag{32c}$$


Equations (31) and (32) show that the product of $\mathcal{R}$ and $\bar{\mathcal{R}}$, defined in
equation (12), is equal to zero at any plane


$$(\mathcal{R}; \bar{\mathcal{R}}) \equiv \tilde{q}\bar{p} - \tilde{\bar{q}}p = 0. \tag{33}$$


Any ray defined by a linear combination of $\mathcal{R}$ and $\bar{\mathcal{R}}$ also belongs to the
ray pencil since its product with either $\mathcal{R}$ or $\bar{\mathcal{R}}$ is equal to zero.
Therefore, the one-parameter manifold of rays $\epsilon\mathcal{R}$, $\epsilon\mathcal{R} + \bar{\mathcal{R}}$, $\epsilon\bar{\mathcal{R}}$, $\mathcal{R} + \epsilon\bar{\mathcal{R}}$,
with $0 < \epsilon < 1$, defines a tube of rays in the ray pencil whose cross
section is a parallelogram with sides $\epsilon q$, $\epsilon q + \bar{q}$, $\epsilon\bar{q}$, and $q + \epsilon\bar{q}$ (see
Fig. 2). The area of this parallelogram is given by the length of the
vector product of $q$ and $\bar{q}$


$$h = q_1\bar{q}_2 - q_2\bar{q}_1 = |Q|. \tag{34}$$

Fig. 2. An astigmatic ray pencil is defined in free space by the manifold of rays
which intersect two mutually perpendicular focal lines such as $F_1$ and $F_2$. At any
transverse plane the amplitude of the field is inversely proportional to the square
root of the area defined by $q$ and $\bar{q}$, the position vectors of any two rays of the ray
pencil ($\mathcal{R}$ and $\bar{\mathcal{R}}$).


Conservation of power requires that $A^2(z)h(z)$ be a constant. $A(z)$ can
therefore be obtained from equation (34). Notice that, at a focal line,
the sign of $h(z)$ changes from positive to negative. Therefore $A(z) \propto
[h(z)]^{-1/2}$ becomes imaginary. If one insists on keeping $A(z)$ real, a $\pi/2$
phase shift must be subtracted from $S$ at such points (anomalous
phase shift).

The elements of the wavefront matrix $\mu$ can also be obtained from
the components of two rays satisfying equation (33). One obtains,
solving equation (32a) for $\mu$


$$\mu_{11} = (\bar{q}_2p_1 - q_2\bar{p}_1)h^{-1}, \tag{35a}$$
$$\mu_{22} = (q_1\bar{p}_2 - \bar{q}_1p_2)h^{-1}, \tag{35b}$$
$$\mu_{12} = \mu_{21} = (q_1\bar{p}_1 - \bar{q}_1p_1)h^{-1} = (\bar{q}_2p_2 - q_2\bar{p}_2)h^{-1}. \tag{35c}$$


The reduced field of the ray pencil is therefore
$$\psi(r, z; \mathcal{R}, \bar{\mathcal{R}}) = \pm h^{-1/2}\exp\left(-j\tfrac{k}{2}\tilde{r}\mu r\right), \tag{36}$$


where $h$ and $\mu$ are given by equations (34) and (35a, b and c) respectively.
The sign ambiguity in the expression of $\psi$ can be resolved only by
counting the number of focal lines along the ray pencil, from some
reference plane.
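
Equations (34) and (35) translate directly into a short routine. The sketch below is illustrative only: the function name and the sample values are assumptions. It builds two rays from a chosen symmetric wavefront matrix through equations (31a) and (31b), then recovers that matrix from the ray components; the same arithmetic applies unchanged to complex rays.

```python
def wavefront_from_rays(q, p, q_bar, p_bar):
    """Return (h, mu) from two rays with zero Lagrange product, eqs. (34)-(35)."""
    h = q[0] * q_bar[1] - q[1] * q_bar[0]                  # eq. (34)
    mu11 = (q_bar[1] * p[0] - q[1] * p_bar[0]) / h         # eq. (35a)
    mu22 = (q[0] * p_bar[1] - q_bar[0] * p[1]) / h         # eq. (35b)
    mu12 = (q[0] * p_bar[0] - q_bar[0] * p[0]) / h         # eq. (35c), first form
    mu21 = (q_bar[1] * p[1] - q[1] * p_bar[1]) / h         # eq. (35c), second form
    return h, [[mu11, mu12], [mu21, mu22]]

def apply2(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

# Sample symmetric wavefront matrix and ray positions (arbitrary values).
mu_true = [[0.5, 0.2], [0.2, -0.3]]
q, q_bar = [1.0, 0.5], [-0.2, 1.0]
p, p_bar = apply2(mu_true, q), apply2(mu_true, q_bar)      # eqs. (31a), (31b)

h, mu = wavefront_from_rays(q, p, q_bar, p_bar)
print(h, mu)
```

The two forms of equation (35c) agree only because the product of the two rays, equation (33), vanishes; comparing `mu[0][1]` with `mu[1][0]` is therefore a convenient consistency check on a pair of rays.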
Let us now show that the field of a ray pencil, as given by equation
(36), is a solution of the parabolic wave equation, equation (8), i.e., that


$$\mathcal{L}\psi(r, z; \mathcal{R}, \bar{\mathcal{R}}) \equiv \left(\nabla^2 - 2jk\,\frac{\partial}{\partial z} + k^2\,\tilde{r}\eta r\right)\left[h^{-1/2}\exp\left(-j\tfrac{k}{2}\tilde{r}\mu r\right)\right] = 0. \tag{37}$$


The first term in equation (37) is


$$\nabla^2\left[h^{-1/2}\exp\left(-j\tfrac{k}{2}\tilde{r}\mu r\right)\right] = h^{-1/2}\exp\left(-j\tfrac{k}{2}\tilde{r}\mu r\right)\times(-jk\,\mathrm{Spur}\,\mu - k^2\,\tilde{r}\mu^2 r). \tag{38}$$


The second term is


$$-2jk\,\frac{\partial}{\partial z}\left[h^{-1/2}\exp\left(-j\tfrac{k}{2}\tilde{r}\mu r\right)\right] = h^{-1/2}\exp\left(-j\tfrac{k}{2}\tilde{r}\mu r\right)\times(jk\,\dot{h}/h - k^2\,\tilde{r}\dot{\mu}r). \tag{39}$$


Using now equations (34), (32a) and (11b), one notices that
$$\dot{h}/h = \mathrm{Spur}\,\mu. \tag{40}$$


Differentiating both sides of equations (31a and b) with respect to $z$
and using the paraxial ray equations [equations (11a) and (11b)] one
obtains


$$(\dot{\mu} + \mu^2 - \eta)q = 0, \tag{41a}$$
$$(\dot{\mu} + \mu^2 - \eta)\bar{q} = 0. \tag{41b}$$


Since $q$ and $\bar{q}$ are generally linearly independent, it results from
equation (41) that


$$\dot{\mu} + \mu^2 = \eta. \tag{42}$$


Upon substitution of equations (38), (39), (40) and (42) in equation
(37), one finds that the field of a paraxial ray pencil is, as expected, a
solution of the parabolic wave equation.

It is important to remark that it has nowhere been specified that
$q$, $p$, $\bar{q}$ and $\bar{p}$ are real quantities. The right side of equation (36) therefore
remains a solution of the wave equation if $\mathcal{R}$ and $\bar{\mathcal{R}}$ are allowed to be
complex valued while remaining solutions of the paraxial ray equations.
In that case, $\mu(z)$, whose elements are given by equations (35a, b and c),
becomes a complex matrix and the exponential term in equation (36)
describes the intensity pattern of the beam as well as its wavefront.
As observed before, the axes of the constant intensity ellipse do not
coincide, in general, with the axes of the wavefront surface. It is possible,
however, to define at any plane an oblique coordinate system in which
both the real part of $\mu$, corresponding to the beam wavefront, and the
imaginary part of $\mu$, corresponding to the beam intensity, are diagonal.
This coordinate transformation is given at the end of this section and
used in Section IV to express in a convenient form the higher order
modes of propagation.

$h(z)$, given by equation (34), and therefore the amplitude term $A(z)$,
become complex quantities too. The ± ambiguity pointed out for the
case of ray pencils does not exist any more since the phase of $A(z)$
changes in a continuous manner along the $z$ axis.

Let us now consider an optical system described by its point
characteristic matrix $[S]$ and calculate the transformation experienced by an
incident gaussian beam whose reduced field has the form given in
equation (36). Introducing this expression in equation (20), one obtains
a reduced field at the output plane


$$\psi'(r') = \pm j\lambda^{-1}|V|^{1/2}h^{-1/2}\iint\exp\left\{-j\tfrac{k}{2}[\tilde{r}(U + \mu)r + 2\tilde{r}Vr' + \tilde{r}'Wr']\right\}d^2r. \tag{43}$$


The integral in equation (43) is easily evaluated if one notices that,
for any nonsingular square matrix $m$ and any conformable column
matrices $r$ and $s$ one has


$$\tilde{r}mr + 2\tilde{r}s = (\tilde{r} + \tilde{s}m^{-1})m(r + m^{-1}s) - \tilde{s}m^{-1}s. \tag{44}$$
Using equation (44) one finds that, if $m$ is a symmetric matrix


$$\iint\exp[-(\tilde{r}mr + 2\tilde{r}s)]\,d^2r = \pi|m|^{-1/2}\exp(\tilde{s}m^{-1}s), \tag{45}$$


provided the integral is defined, i.e., provided $\tilde{r}$ (real part of $m$) $r$ is a
positive definite form. Substituting


$$m = j\tfrac{k}{2}(U + \mu) \tag{46a}$$


and


$$s = j\tfrac{k}{2}Vr' \tag{46b}$$
in equation (45), one obtains a reduced field at the output plane


$$\psi'(r') = \pm(h\,|U + \mu|\,|V|^{-1})^{-1/2}\exp\left\{-j\tfrac{k}{2}\tilde{r}'[W - \tilde{V}(U + \mu)^{-1}V]r'\right\}. \tag{47}$$


This field has the same general form as the input field and describes
a gaussian beam with a wavefront matrix


$$\mu' = W - \tilde{V}(U + \mu)^{-1}V, \tag{48a}$$
or, in terms of the ray matrix (see Appendix A)
$$\mu' = (C + D\mu)(A + B\mu)^{-1}. \tag{48b}$$


This interesting relation† generalizes the "ABCD law" which describes
the transformation of the complex wavefront in two dimensions.
In some applications, it is also of interest to know the phase shift
experienced by the beam through the optical system. It is given, from
equation (47), to within $\pi$, by the simple expression


$$\text{Phase shift} = kz' - \tfrac{1}{2}\,\text{Phase of}\,(|U + \mu|\,|V|^{-1}), \tag{49}$$


where $kz'$ is the geometrical optics phase shift. Equation (49) reduces
to the expression given in Ref. 24 in the case of systems with rotational
symmetry. One also verifies, after a few rearrangements, that the
amplitude of the beam at the output plane assumes the form given
in equation (34), i.e., that


$$h' \equiv h\,|U + \mu|\,|V|^{-1} = q_1'\bar{q}_2' - q_2'\bar{q}_1', \tag{50}$$


where $q'$ and $\bar{q}'$ denote the output (complex) ray position vectors.
Equations (48), (49) and (50) completely define the transformation of
fundamental gaussian beams propagating along the axis of
nonorthogonal optical systems.
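
The transformation law, equation (48a), can be exercised on a case where the answer is known in closed form. The sketch below is an illustration under stated assumptions: free space of length `d` is represented by $U = W = I/d$, $V = -I/d$ (i.e., $S = |r - r'|^2/(2d)$), for which equation (48a) should reduce to $\mu' = \mu(I + d\mu)^{-1}$; the helper names and the sample complex matrix are arbitrary.

```python
def add2(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def sub2(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

def mul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose2(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def inv2(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]

def transform_wavefront(mu, U, V, W):
    """Equation (48a): mu' = W - V~ (U + mu)^(-1) V."""
    return sub2(W, mul2(mul2(transpose2(V), inv2(add2(U, mu))), V))

def free_space(d):
    # Assumed free-space blocks of [S] for a section of length d.
    U = [[1.0 / d, 0.0], [0.0, 1.0 / d]]
    V = [[-1.0 / d, 0.0], [0.0, -1.0 / d]]
    W = [[1.0 / d, 0.0], [0.0, 1.0 / d]]
    return U, V, W

# Sample complex symmetric wavefront matrix (general astigmatism, arbitrary values).
mu_in = [[0.1 - 0.2j, 0.05j], [0.05j, -0.3 - 0.4j]]
d = 3.0
mu_out = transform_wavefront(mu_in, *free_space(d))

# Closed-form free-space result used for comparison: mu (I + d mu)^(-1).
identity = [[1.0, 0.0], [0.0, 1.0]]
expected = mul2(mu_in, inv2(add2(identity, [[d * x for x in row] for row in mu_in])))
print(mu_out)
print(expected)
```

For a scalar $\mu$ the free-space result reads $1/\mu' = 1/\mu + d$, which is the familiar propagation law of the complex beam parameter.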

† The transformation of the complex curvature of gaussian beams through
nonorthogonal systems has been given before (in a very complicated form) by
Y. Suematsu and H. Fukinuki.3 Equation (48b) can alternatively be obtained
without integration by writing down the laws of transformation of (real)
astigmatic ray pencils, as suggested in Ref. 25.

These solutions are easily generalized to the case where the axis of
the incident beam is a ray $\hat{\mathcal{R}}(\hat{q}, \hat{p})$, distinct from the system axis. Let
$\psi(r, z)$ denote the field of an arbitrary beam and $\hat{\mathcal{R}}$ denote an arbitrary
ray; one can show that


$$\psi(r, z; \hat{\mathcal{R}}) = \psi(r - \hat{q}, z)\exp[-jk(\tilde{r}\hat{p} - \tfrac{1}{2}\tilde{\hat{q}}\hat{p})] \tag{51}$$
is a solution of the parabolic wave equation, equation (8). An equivalent
result can alternatively be obtained from the integral transformation,
equation (20), by introducing a change of variables. According to
equation (51), a general form for the propagation of gaussian beams is
obtained by introducing the expression for the field obtained before
[equation (36)] into equation (51)


$$\psi(r, z; \mathcal{R}, \bar{\mathcal{R}}, \hat{\mathcal{R}}) = h^{-1/2}\exp\left\{-j\tfrac{k}{2}[(\tilde{r} - \tilde{\hat{q}})\mu(r - \hat{q}) + 2\tilde{r}\hat{p} - \tilde{\hat{q}}\hat{p}]\right\}. \tag{52}$$


Notice that $\hat{q}$ and $\hat{p}$ need not be real for the right side of equation (52)
to satisfy the parabolic wave equation. It is merely required that they
satisfy the paraxial ray equations. When $\hat{\mathcal{R}}$ assumes complex values,
however, it cannot be interpreted any longer as a beam axis. Such
solutions, with $\hat{\mathcal{R}}$ complex, are of interest to generate higher order modes
of propagation, as shown in the next section.

Let us now show that the fundamental mode of propagation can be
written in a form resembling the form obtained in the case of orthogonal
systems. This can be done by introducing, at each transverse plane, a
coordinate system in which $\mu$ is diagonal.

The reduced eikonal $\Phi$ can be written


$$\Phi = \tfrac{1}{2}\tilde{r}\mu r = \tfrac{1}{2}(\tilde{r}\mu'r + j\tilde{r}\mu''r), \tag{53}$$


where $\mu'$ and $\mu''$ denote the real and imaginary parts of $\mu$, respectively.
Two quadratic forms such as $\tilde{r}\mu'r$ and $\tilde{r}\mu''r$ can be simultaneously
diagonalized if a proper (generally oblique) coordinate system is
introduced. The explicit expression for this transformation is not necessary
here, because we are interested only in the general form of the field;
it is given in Appendix C. To deal with oblique coordinates, it is
convenient to introduce a tensorial notation. The expression for the scalar
product of two real vectors† $q$ and $p$ in oblique coordinates assumes the
form


$$\tilde{q}p = q^ip_i, \tag{54}$$


where $q^1$ and $q^2$ denote the (contravariant) components of $q$, obtained
by drawing lines parallel to the axes from the tip of the vector $q$ as
shown in Fig. 3, and where $p_1$, $p_2$ denote the (covariant) components
of $p$, obtained by drawing lines perpendicular to the axes. For brevity
the summation sign over repeated indices is omitted.


† The following relations are also applicable to complex vectors since such
vectors can be defined as linear combinations $V_r + jV_i$ of two real vectors $V_r$
and $V_i$.

[Figure 3: ellipses labeled PHASE = CONSTANT and INTENSITY = CONSTANT.]

Fig. 3. This figure represents the oblique coordinate system, defined in the $x_1x_2$
transverse plane, which diagonalizes both the real and the imaginary parts of the
wavefront (represented schematically by ellipses). The contravariant components
of the position vector $q$, and the covariant components of the direction cosine
vector $p$ are also represented. It is assumed that the unit vectors of the coordinate
system have a unit length. The index in the rectangular coordinate system is placed
at a lower position only to distinguish it from the oblique coordinate system.


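The reduced eikonal $\Phi$ is now written
$$\Phi = \tfrac{1}{2}\mu_{ij}x^ix^j, \tag{55}$$


where $x^i$, $i = 1, 2$ (or $x^j$, $j = 1, 2$) denote the contravariant components
of a position vector $r$, and $\mu$ denotes a twice covariant tensor. With this
notation, equations (31) and (32) are valid in any coordinate system.
Therefore, in the coordinate system in which $\mu$ is diagonal ($\mu_{12} = 0$),
we have


$$p_1 = \mu_{11}q^1, \tag{56a}$$
$$p_2 = \mu_{22}q^2, \tag{56b}$$
$$\bar{p}_1 = \mu_{11}\bar{q}^1, \tag{56c}$$
$$\bar{p}_2 = \mu_{22}\bar{q}^2. \tag{56d}$$
Let us set
$$\mu_{11} = \frac{p_1}{q^1} = \frac{\bar{p}_1}{\bar{q}^1} = C_1 - 2jk^{-1}w_1^{-2}, \tag{57a}$$
$$\mu_{22} = \frac{p_2}{q^2} = \frac{\bar{p}_2}{\bar{q}^2} = C_2 - 2jk^{-1}w_2^{-2}. \tag{57b}$$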
In equation (57), $C_i$ and $-2k^{-1}w_i^{-2}$ denote the real and imaginary
parts of $\mu_{ii}$, respectively. The reason for the notation $2k^{-1}w_i^{-2}$ is that
$w_i$ represent the beam radii along the coordinate axes, as shown in
equation (59).
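
As a small illustration of equation (57), the wavefront curvature and the beam radius along one diagonalizing axis follow from a diagonal element of the complex wavefront matrix. The function name and the sample wavelength and radius below are assumptions chosen for the example.

```python
import math

def curvature_and_radius(mu_ii, k):
    """Split mu_ii = C_i - 2j / (k w_i^2), eq. (57), into (C_i, w_i).

    Requires Im(mu_ii) < 0, i.e. a field that decays away from the axis.
    """
    C = mu_ii.real
    w = math.sqrt(-2.0 / (k * mu_ii.imag))
    return C, w

k = 2.0 * math.pi / 1.0e-6          # assumed free-space wavelength of 1 micrometer
w_true, C_true = 1.0e-3, 0.5        # assumed beam radius (m) and curvature (1/m)
mu_ii = complex(C_true, -2.0 / (k * w_true ** 2))

C, w = curvature_and_radius(mu_ii, k)
print(C, w)
```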

In the new coordinate system (with base vectors of unit lengths),
the area of the parallelogram constructed on the vectors $q$ and $\bar{q}$ is


$$\sin\gamma\,(q^1\bar{q}^2 - q^2\bar{q}^1), \tag{58}$$


where $\gamma$ is the angle between the two coordinate axes. The field of the
fundamental mode, equation (36), can consequently be rewritten


$$\psi_{00}(x^1, x^2, z; \mathcal{R}, \bar{\mathcal{R}}) = [\sin\gamma\,(q^1\bar{q}^2 - q^2\bar{q}^1)]^{-1/2}\cdot\exp\{-[(x^1/w_1)^2 + (x^2/w_2)^2]\}\cdot\exp\left\{-j\tfrac{k}{2}[C_1(x^1)^2 + C_2(x^2)^2]\right\}. \tag{59}$$


The first exponential term in equation (59) describes the beam
intensity pattern and the second one describes the wavefront of the beam.

In the special case where the lens-like medium is orthogonal, one
may choose $\mathcal{R}$ and $\bar{\mathcal{R}}$ in two mutually orthogonal planes. Assuming that
these planes coincide with the $x_1z$ and $x_2z$ planes, respectively, we have


$$q_2 = p_2 = \bar{q}_1 = \bar{p}_1 = 0, \tag{60}$$


and equation (59) reduces to the known form (see, for example, Ref. 24)


$$\psi_{00}(x_1, x_2, z; \mathcal{R}, \bar{\mathcal{R}}) = q_1^{-1/2}\exp\left[-(x_1/w_1)^2 - j\tfrac{k}{2}C_1x_1^2\right]\cdot\bar{q}_2^{-1/2}\exp\left[-(x_2/w_2)^2 - j\tfrac{k}{2}C_2x_2^2\right]. \tag{61}$$


IV. HIGHER ORDER MODES OF PROPAGATION 


Two procedures are given in this section to obtain the higher order 
modes of propagation. One is based on the power series expansion of the 
field of off-set gaussian beams and the other is based on the application 
of differential operators on the fundamental mode. These two methods 
can be shown to be equivalent. They lead however to two different 
representations of the field, one in terms of Hermite polynomials in 
two complex variables and the other in terms of finite series of ordinary 
Hermite polynomials. Both representations are of interest. 

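It has been shown in the previous section that the field of a gaussian
beam propagating along the axis of an optical system is fully described
by two complex rays, denoted $\mathcal{R}$ and $\bar{\mathcal{R}}$, satisfying equation (33). This
solution of the wave equation can be generalized to include the case
where the beam axis is a ray $\hat{\mathcal{R}}$. It was pointed out also that $\hat{\mathcal{R}}$ may
assume complex values provided its position and direction vectors
($\hat{q}$, $\hat{p}$) remain solutions of the ray equations. Let us define $\hat{\mathcal{R}}$ as a linear
combination of the rays $\mathcal{R}^*$ and $\bar{\mathcal{R}}^*$, conjugate of $\mathcal{R}$ and $\bar{\mathcal{R}}$, respectively.
We have


$$\hat{q} = \alpha_1q^* + \alpha_2\bar{q}^* \equiv Q^*\alpha, \tag{62}$$
$$\hat{p} = \alpha_1p^* + \alpha_2\bar{p}^* \equiv P^*\alpha, \tag{63}$$


where $\alpha_1$ and $\alpha_2$ are two arbitrary parameters. Introducing these
expressions in equation (52) one obtains


$$\psi(r, z; \mathcal{R}, \bar{\mathcal{R}}; \alpha_1, \alpha_2) = \psi_{00}(r, z; \mathcal{R}, \bar{\mathcal{R}})\times\exp(\tilde{\alpha}y - \tfrac{1}{2}\tilde{\alpha}\nu\alpha), \tag{64}$$


where $\psi_{00}(r, z; \mathcal{R}, \bar{\mathcal{R}})$ denotes the fundamental mode field and where we
have defined


$$\alpha \equiv \begin{bmatrix}\alpha_1\\ \alpha_2\end{bmatrix}, \tag{65a}$$

$$\nu \equiv \begin{bmatrix}\nu_{11} & \nu_{12}\\ \nu_{12} & \nu_{22}\end{bmatrix} = -2k\tilde{Q}^*\mu''Q^*, \tag{65b}$$

$$y \equiv \begin{bmatrix}y_1\\ y_2\end{bmatrix} = -2k\tilde{Q}^*\mu''r. \tag{65c}$$


Notice that $\nu$ is a symmetric matrix as a result of equation (33).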
One now observes that the exponential term in equation (64) is the
generating function for Hermite polynomials in two variables


$$\exp(\tilde{\alpha}y - \tfrac{1}{2}\tilde{\alpha}\nu\alpha) = \sum_{m,n=0}^{\infty}\frac{\alpha_1^m\alpha_2^n}{m!\,n!}H_{mn}(y; \nu), \tag{66}$$


where the polynomials $H_{mn}$ have the form


mi n 1 m m = I m— n 
Ve Ome Creer PSH ie imma), ui "Ys 


mn Heat ye lam — I icin “ 
5 ee ie Viel), "Y> : Se 9 awe, 1 2 ai | eae (67) 
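and, for $m + n \le 3$

$$\begin{aligned}
H_{00} &= 1,\\
H_{10} &= y_1,\\
H_{01} &= y_2,\\
H_{20} &= y_1^2 - \nu_{11},\\
H_{11} &= y_1y_2 - \nu_{12},\\
H_{02} &= y_2^2 - \nu_{22},\\
H_{30} &= y_1^3 - 3\nu_{11}y_1,\\
H_{21} &= y_1^2y_2 - 2\nu_{12}y_1 - \nu_{11}y_2,\\
H_{12} &= y_1y_2^2 - 2\nu_{12}y_2 - \nu_{22}y_1,\\
H_{03} &= y_2^3 - 3\nu_{22}y_2.
\end{aligned} \tag{68}$$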
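
The polynomials listed in equation (68) can be generated to any order from the generating function, equation (66). The sketch below does not use the closed form of equation (67); it relies instead on the recurrence $H_{m+1,n} = y_1H_{m,n} - m\nu_{11}H_{m-1,n} - n\nu_{12}H_{m,n-1}$ (and its counterpart in $n$), obtained here by differentiating the generating function with respect to $\alpha_1$ and $\alpha_2$. The function name and the sample values are assumptions.

```python
from functools import lru_cache

def hermite2(m, n, y1, y2, nu11, nu12, nu22):
    """Two-variable Hermite polynomial H_mn(y; nu) defined by eq. (66)."""

    @lru_cache(maxsize=None)
    def H(a, b):
        if a < 0 or b < 0:
            return 0.0
        if a == 0 and b == 0:
            return 1.0
        if a > 0:
            # Recurrence in the first index (derivative with respect to alpha_1).
            return (y1 * H(a - 1, b)
                    - (a - 1) * nu11 * H(a - 2, b)
                    - b * nu12 * H(a - 1, b - 1))
        # a == 0: recurrence in the second index (derivative w.r.t. alpha_2).
        return y2 * H(0, b - 1) - (b - 1) * nu22 * H(0, b - 2)

    return H(m, n)

# Arbitrary sample values; complex y and nu are handled by the same code.
y1, y2, nu11, nu12, nu22 = 1.5, -0.5, 2.0, 0.3, 0.7
h21 = hermite2(2, 1, y1, y2, nu11, nu12, nu22)
h12 = hermite2(1, 2, y1, y2, nu11, nu12, nu22)
h03 = hermite2(0, 3, y1, y2, nu11, nu12, nu22)
print(h21, h12, h03)
```

The printed values can be compared with the corresponding entries of equation (68) evaluated at the same arguments.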


Each coefficient in the expansion of $\psi(r, z; \mathcal{R}, \bar{\mathcal{R}}; \alpha_1, \alpha_2)$ in power
series of $\alpha_1$, $\alpha_2$ is necessarily a solution of the wave equation since $\alpha_1$
and $\alpha_2$ are arbitrary numbers. New solutions of the wave equation are
therefore obtained in the form


$$\psi_{mn}(r, z; \mathcal{R}, \bar{\mathcal{R}}) = \psi_{00}(r, z; \mathcal{R}, \bar{\mathcal{R}})\,H_{mn}(-2k\tilde{Q}^*\mu''r; \nu). \tag{69}$$


It is demonstrated in Appendix D that this set of solutions forms an
orthogonal system, provided the condition $(\mathcal{R}; \bar{\mathcal{R}}^*) = 0$ is satisfied
[in addition to equation (33)]. The fact that $y_1$ and $y_2$ are complex does
not raise any particular difficulty in calculating $H_{mn}(y_1, y_2; \nu)$ from
equations (67) and (68). This prevents us, however, from identifying
$y_1$ and $y_2$ with real coordinates. It is important to notice that
multiplication of $\mathcal{R}$ by a factor $\lambda$ (i.e., $q \to \lambda q$, $p \to \lambda p$) and $\bar{\mathcal{R}}$ by a factor
$\bar{\lambda}$ ($\bar{q} \to \bar{\lambda}\bar{q}$, $\bar{p} \to \bar{\lambda}\bar{p}$) leaves essentially unchanged the field given in
equation (69); it is merely multiplied by a constant. This property
results from equation (65) and the general form of $H_{mn}$ given in
equation (67). Consequently $\mathcal{R}$ and $\bar{\mathcal{R}}$ need to be defined only to within
constant factors.

In the special case where the optical system is orthogonal, one may
choose $\mathcal{R}$ and $\bar{\mathcal{R}}$ in two mutually perpendicular meridional planes
coincident with the $x_1z$ and $x_2z$ planes respectively ($q_2 = p_2 = 0$,
$\bar{q}_1 = \bar{p}_1 = 0$). The matrix $\nu$ becomes diagonal and the Hermite
polynomials in two variables reduce to a product of two Hermite
polynomials in one variable. To within a constant one has, in that special
case
$$H_{mn}(y_1, y_2; \nu) = \nu_{11}^{m/2}\nu_{22}^{n/2}\,H_m(2^{-1/2}\nu_{11}^{-1/2}y_1)\,H_n(2^{-1/2}\nu_{22}^{-1/2}y_2), \tag{70}$$


where $H_k(x)$ denotes a Hermite polynomial in one variable of order $k$
(as defined in Ref. 33). Using equation (57), the right side of equation
(70) can be written, to within a constant


$$(q_1^*/q_1)^{m/2}(\bar{q}_2^*/\bar{q}_2)^{n/2}\,H_m(2^{1/2}x_1/w_1)\,H_n(2^{1/2}x_2/w_2), \tag{71}$$


in agreement with previous results.

The procedure just described for obtaining new solutions of the
wave equation can be applied to an arbitrary field $\psi(r, z)$. The coefficients
of the power series expansion are obtained in that case by repeated
differentiation. If one calculates the coefficients for the first few orders,
one finds that they assume the form


$$\psi_{mn}(r, z; \mathcal{R}, \bar{\mathcal{R}}) = \Lambda^m(\mathcal{R}^*)\Lambda^n(\bar{\mathcal{R}}^*)\,\psi(r, z), \tag{72}$$

where $\Lambda(\mathcal{R})$ and $\Lambda(\bar{\mathcal{R}})$ are differential operators defined by
$$\Lambda(\mathcal{R}) = \tilde{p}r - jk^{-1}\tilde{q}\nabla, \tag{73a}$$
$$\Lambda(\bar{\mathcal{R}}) = \tilde{\bar{p}}r - jk^{-1}\tilde{\bar{q}}\nabla. \tag{73b}$$


It is not difficult to show, using equation (33), that these two operators
commute with one another. For generality, let us demonstrate
equation (72) on the basis of the integral transformation, equation (20).†

† Alternatively one can show that the operator $\Lambda(\mathcal{R})$ [or $\Lambda(\bar{\mathcal{R}})$] commutes with the
wave equation operator, equation (8). This result has been obtained before by
Popov for a special form of $\Lambda(\mathcal{R})$.

Let $\psi(r)$ and $\psi'(r')$ denote fields at the input and output planes,
respectively, of an optical system described by its reduced point
characteristic $S$. Let us prove that a field $\Lambda(\mathcal{R})\psi(r)$ is transformed into
$\Lambda'(\mathcal{R}')\psi'(r')$ at the output plane, i.e., that


$$(\tilde{p}'r' - jk^{-1}\tilde{q}'\nabla')\iint\psi(r)\exp(-jkS)\,d^2r = \iint[(\tilde{p}r - jk^{-1}\tilde{q}\nabla)\psi(r)]\exp(-jkS)\,d^2r. \tag{74}$$


Notice that the constant term $\pm j\lambda^{-1}|V|^{1/2}$ in equation (20) can be
dropped. The primes in equation (74) refer as before to quantities taken
at the output plane.

Using equation (16), one finds that


$$\nabla'\exp(-jkS) = -jk(\tilde{V}r + Wr')\exp(-jkS). \tag{75}$$

Therefore the left side of equation (74) can be written
$$\iint[\tilde{p}'r' - \tilde{q}'(\tilde{V}r + Wr')]\psi(r)\exp(-jkS)\,d^2r. \tag{76}$$


To evaluate the right side of equation (74), notice that, for any function
$F(x_1, x_2)$ which tends exponentially to zero as $x_1, x_2 \to \pm\infty$, one has


$$\iint\nabla F(x_1, x_2)\,dx_1\,dx_2 = 0. \tag{77}$$


Therefore, setting
$$F(x_1, x_2) = \psi(r)\times\exp(-jkS) \tag{78}$$


in equation (77), one obtains
$$\iint[\nabla\psi(r)]\times\exp(-jkS)\,d^2r = \iint(jk\nabla S)\,\psi(r)\times\exp(-jkS)\,d^2r. \tag{79}$$


Using again equation (16) to evaluate $\nabla S$, the right side in equation (74)
becomes


$$\iint[\tilde{p}r + \tilde{q}(Ur + Vr')]\psi(r)\exp(-jkS)\,d^2r. \tag{80}$$


The identity of the two terms in brackets in equations (76) and (80)
results from the ray equations, equation (18).

The property established for $\Lambda(\mathcal{R})$ clearly holds true also for the
operator $\Lambda^m(\mathcal{R})$ corresponding to $m$ applications of $\Lambda(\mathcal{R})$, and for the
operator $\Lambda^n(\bar{\mathcal{R}})$ associated with another ray $\bar{\mathcal{R}}$.

When applied to a gaussian beam (defined by $\mathcal{R}$, $\bar{\mathcal{R}}$), the operators
$\Lambda(\mathcal{R})$ and $\Lambda(\bar{\mathcal{R}})$ give a result identically equal to zero. Higher order
modes are obtained, however, if one considers the operators associated
with the conjugate rays $\mathcal{R}^*$, $\bar{\mathcal{R}}^*$. One therefore calculates


$$\psi_{mn}(r, z; \mathcal{R}, \bar{\mathcal{R}}) = \Lambda^m(\mathcal{R}^*)\Lambda^n(\bar{\mathcal{R}}^*)\,\psi_{00}(r, z; \mathcal{R}, \bar{\mathcal{R}}). \tag{81}$$


To give a convenient form to the right side of equation (81), let us
write down explicitly in tensorial notation (see Section III) the operator
$\Lambda(\mathcal{R}^*)$, defined by equation (73a)


$$\Lambda(\mathcal{R}^*) = p_i^*x^i - jk^{-1}q^{i*}\nabla_i = \left(p_1^*x^1 - jk^{-1}q^{1*}\frac{\partial}{\partial x^1}\right) + \left(p_2^*x^2 - jk^{-1}q^{2*}\frac{\partial}{\partial x^2}\right). \tag{82}$$


Using equations (82) and (59) and the relation
$$\frac{d}{dx}H_k(x) = 2xH_k(x) - H_{k+1}(x), \tag{83}$$


where $H_k(x)$ denotes a Hermite polynomial of order $k$, one finds that
$$\Lambda(\mathcal{R}^*)\{H_m(2^{1/2}x^1/w_1)H_n(2^{1/2}x^2/w_2)\,\psi_{00}(x^1, x^2, z; \mathcal{R}, \bar{\mathcal{R}})\}$$
$$\propto \psi_{00}(x^1, x^2, z; \mathcal{R}, \bar{\mathcal{R}})\,[(q^{1*}/w_1)H_{m+1}(2^{1/2}x^1/w_1)H_n(2^{1/2}x^2/w_2) + (q^{2*}/w_2)H_m(2^{1/2}x^1/w_1)H_{n+1}(2^{1/2}x^2/w_2)]. \tag{84}$$


A similar relation holds for $\Lambda(\bar{\mathcal{R}}^*)$. These two relations show, by
recurrence, that the field of the mode $m$, $n$ can be written


$$\psi_{mn}(x^1, x^2, z; \mathcal{R}, \bar{\mathcal{R}}) = \Lambda^m(\mathcal{R}^*)\Lambda^n(\bar{\mathcal{R}}^*)\,\psi_{00}(x^1, x^2, z; \mathcal{R}, \bar{\mathcal{R}})$$
$$\propto \psi_{00}(x^1, x^2, z; \mathcal{R}, \bar{\mathcal{R}})\times[(q^{1*}/w_1)H(2^{1/2}x^1/w_1) + (q^{2*}/w_2)H(2^{1/2}x^2/w_2)]^m\times[(\bar{q}^{1*}/w_1)H(2^{1/2}x^1/w_1) + (\bar{q}^{2*}/w_2)H(2^{1/2}x^2/w_2)]^n, \tag{85}$$


where the convention is made that, after multiplication of the two
binomials, $H^k(x)$ actually represents a Hermite polynomial of order
$k$: $H_k(x)$. This form of the field shows that the higher order modes of
propagation can be obtained by multiplying the fundamental solution
by a finite series of Hermite polynomials in one real variable.† Since
$q^2/q^1$ and $\bar{q}^2/\bar{q}^1$ are generally complex, the wavefronts are different for
each mode. It is shown in the next section that $q^2/q^1$ and $\bar{q}^2/\bar{q}^1$ happen
to be real, however, at the end mirrors of linear resonators. From this
observation, it results that the wavefronts of all of the resonating
modes generally coincide with the end mirror surfaces.
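
The Hermite relation, equation (83), on which the recurrence above rests, is easily checked numerically. The sketch below assumes the usual convention $H_0 = 1$, $H_1 = 2x$, $H_{k+1} = 2xH_k - 2kH_{k-1}$ for the one-variable polynomials; the function names and the test point are arbitrary.

```python
def hermite(k, x):
    """One-variable Hermite polynomial H_k(x), with H_0 = 1 and H_1 = 2x."""
    h_prev, h = 1.0, 2.0 * x
    if k == 0:
        return h_prev
    for n in range(1, k):
        # Three-term recurrence: H_{n+1} = 2x H_n - 2n H_{n-1}.
        h_prev, h = h, 2.0 * x * h - 2.0 * n * h_prev
    return h

def hermite_derivative(k, x):
    # Equation (83): dH_k/dx = 2x H_k(x) - H_{k+1}(x).
    return 2.0 * x * hermite(k, x) - hermite(k + 1, x)

x0 = 0.7
analytic = hermite_derivative(3, x0)
# Central finite difference of H_3 for comparison.
step = 1e-6
numeric = (hermite(3, x0 + step) - hermite(3, x0 - step)) / (2 * step)
print(analytic, numeric)
```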

Another special case of interest is the case where q’/q’ and ¢°/g' are 
both equal to 7. This happens in the case of systems with rotational 
symmetry, such as the “cavities with image rotation” which are investi- 
gated in Section VI. 


V. NONORTHOGONAL RESONATORS 


We are concerned in this section with the resonant fields and the
resonant frequencies of nonorthogonal resonators. Ring-type resonators
being conceptually simpler than linear resonators, their properties are
considered first. A ring-type resonator is essentially a section of
waveguide closed on itself. An optical beam is a mode of the resonator if,
after a round trip, its field reproduces itself exactly.

† In the case where $m = 0$ (or $n = 0$) it is not difficult to show that the two
expressions given for the mode $mn$ in equations (69) and (85), respectively,
coincide. This result can be obtained by writing equation (69) in the coordinate
system in which $\mu$ (but not necessarily $\nu$) is diagonal and using an expansion
formula [equation (22), p. 371 of Ref. 32] for $H_{mn}$ and a condensation formula
[equation (31), p. 345 of Ref. 32] for the right side of equation (85). In the
general case a direct comparison of the two expressions appears to be difficult.

The general form of the solutions obtained in the previous sections
(Sections III and IV) is preserved as the beams propagate through
an optical system. In general, however, the field distribution at the
output plane of a section of waveguide does not coincide with the
field distribution at the input plane [see, for instance, the transformation
law, equation (48b), for the fundamental mode]. By a proper
choice of the mode parameters it is possible, however, to achieve
coincidence between the fields at the two planes (except, perhaps,
for a constant phase factor). In that case, the beam is said to be
matched to the section of waveguide considered. Clearly, such a beam
would also be matched to a sequence of identical sections, forming
a periodic waveguide. For the fundamental mode, the matching
condition can be obtained by specifying that $\mu' = \mu$ in equation (48b)
and solving for $\mu$. However, it is more convenient to look first for
rays which reproduce themselves after a round trip in the system
(except for a constant factor) and calculate the wavefront matrix $\mu$
associated with these rays. Such rays are called eigenrays; they are
always complex in the case of stable resonators.

To obtain the eigenrays, let us replace $q'$ and $p'$ by $\lambda q$ and
$\lambda p$, respectively, in equation (18). One obtains the relations

$$-p = (U + \lambda V)q, \tag{86a}$$
$$p = (\lambda^{-1}\tilde{V} + W)q, \tag{86b}$$

and, by addition and subtraction,

$$0 = (U + W + \lambda V + \lambda^{-1}\tilde{V})q, \tag{87a}$$
$$p = \tfrac{1}{2}(W - U + \lambda^{-1}\tilde{V} - \lambda V)q. \tag{87b}$$

Equation (87a) actually represents a system of two homogeneous
linear equations which admit a solution only if

$$|U + W + \lambda V + \lambda^{-1}\tilde{V}| = 0, \tag{88}$$

where the bars denote a determinant. Equation (88) can be rewritten
as a second-degree equation in $(\lambda + \lambda^{-1})$, as shown previously for a
special case. One obtains

$$|V|(\lambda + \lambda^{-1})^2 + [V_{11}K_{22} + K_{11}V_{22} - K_{12}(V_{12} + V_{21})](\lambda + \lambda^{-1}) + |K| - (V_{12} - V_{21})^2 = 0, \tag{89}$$
where we have defined

$$K = U + W. \tag{90}$$

The resonator is stable when the solutions of equation (89) for

$$(\lambda + \lambda^{-1})/2 = \cos\theta \tag{91}$$

are real and are in the range $-1$ to $+1$; this is assumed henceforth.
In that case, two real characteristic angles, denoted $\theta$ and $\hat{\theta}$, are
obtained, the two other characteristic angles being clearly $-\theta$ and $-\hat{\theta}$.
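The stability test of equations (89) through (91) can be sketched numerically. The block below is a minimal illustration, not part of the original analysis: it treats equation (89) as a quadratic in $s = \lambda + \lambda^{-1} = 2\cos\theta$ and returns the two values of $\cos\theta$. The function names and the example ring (an astigmatic thin lens with focal lengths 2 and 4 followed by a unit path, with the blocks U, V, W taken from equation (115) of Appendix B for $\nu = 0$, $\Omega = 0$) are assumptions chosen for illustration.

```python
import cmath
import math


def det2(a):
    """Determinant of a 2x2 matrix given as nested lists."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def characteristic_cosines(u, v, w):
    """Solve equation (89) and return the two values of cos(theta).

    u, v, w are the 2x2 blocks U, V, W of the point characteristic matrix.
    The unknown of the quadratic is s = lambda + 1/lambda = 2 cos(theta).
    Assumes |V| != 0.
    """
    k = [[u[i][j] + w[i][j] for j in range(2)] for i in range(2)]  # eq. (90)
    a = det2(v)
    b = (v[0][0] * k[1][1] + k[0][0] * v[1][1]
         - k[0][1] * (v[0][1] + v[1][0]))
    c = det2(k) - (v[0][1] - v[1][0]) ** 2
    root = cmath.sqrt(b * b - 4.0 * a * c)
    # s = (-b +/- root) / (2a), and cos(theta) = s / 2.
    return [(-b + root) / (4.0 * a), (-b - root) / (4.0 * a)]


def is_stable(cosines, tol=1e-12):
    """Equation (91): both cosines must be real and lie within [-1, 1]."""
    return all(abs(x.imag) < tol and -1.0 <= x.real <= 1.0 for x in cosines)


# Hypothetical example: astigmatic lens (f1, f2) followed by a path d,
# principal axes along x1 and x2, no image rotation.
d, f1, f2 = 1.0, 2.0, 4.0
U = [[1 / d - 1 / f1, 0.0], [0.0, 1 / d - 1 / f2]]
V = [[-1 / d, 0.0], [0.0, -1 / d]]
W = [[1 / d, 0.0], [0.0, 1 / d]]

cosines = characteristic_cosines(U, V, W)
angles_deg = sorted(math.degrees(math.acos(x.real)) for x in cosines)
print(sorted(x.real for x in cosines), angles_deg, is_stable(cosines))
```

In this orthogonal special case the two roots reduce to $1 - d/2f_1$ and $1 - d/2f_2$, the familiar result for a periodic sequence of lenses, which gives a quick check on the coefficients of equation (89).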

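If one introduces one of the four eigenvalues $\lambda = \exp(j\theta)$,
$\hat{\lambda} = \exp(j\hat{\theta})$, $\lambda^* = \exp(-j\theta)$ or $\hat{\lambda}^* = \exp(-j\hat{\theta})$ in equations (87a) and
(87b), one obtains (to within arbitrary constants) the components of the
four eigenrays denoted respectively $\mathcal{R}$, $\hat{\mathcal{R}}$, $\mathcal{R}^*$ and $\hat{\mathcal{R}}^*$. Let us show that
the product of $\mathcal{R}$ and $\hat{\mathcal{R}}$ [defined in equation (12)] is equal to zero.

Since $(\mathcal{R};\hat{\mathcal{R}})$ is invariant, one may choose a reference plane along the
path where the matrix V is symmetric. At such a plane, equations (87a)
and (87b) assume the form

$$0 = [U + W + (\lambda + \lambda^{-1})V]q, \tag{92a}$$
$$p = \tfrac{1}{2}[W - U + (\lambda^{-1} - \lambda)V]q. \tag{92b}$$

Since both U + W and V are symmetric, one has

$$\tilde{q}V\hat{q} = \tilde{\hat{q}}Vq = 0, \tag{93}$$

provided the absolute values of $\theta$ and $\hat{\theta}$ are distinct. Therefore

$$(\mathcal{R};\hat{\mathcal{R}}) = \tilde{q}\hat{p} - \tilde{\hat{q}}p$$
$$= \tfrac{1}{2}\tilde{q}[W - U + (\hat{\lambda}^{-1} - \hat{\lambda})V]\hat{q} - \tfrac{1}{2}\tilde{\hat{q}}[W - U + (\lambda^{-1} - \lambda)V]q$$
$$= \tfrac{1}{2}(\hat{\lambda}^{-1} - \hat{\lambda})\tilde{q}V\hat{q} - \tfrac{1}{2}(\lambda^{-1} - \lambda)\tilde{\hat{q}}Vq = 0. \tag{94a}$$

One also has, replacing $\lambda$ by $\lambda^{-1}$ and/or $\hat{\lambda}$ by $\hat{\lambda}^{-1}$,

$$(\mathcal{R}^*;\hat{\mathcal{R}}^*) = 0; \quad (\mathcal{R};\hat{\mathcal{R}}^*) = 0; \quad (\mathcal{R}^*;\hat{\mathcal{R}}) = 0. \tag{94b}$$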


Therefore, according to the results of Section III, each pair of eigenrays
in equation (94) defines a gaussian beam. The choice between the four
pairs of eigenrays can be made by giving either a positive or a negative
sign to $\theta$ and $\hat{\theta}$. It is made in such a way that the imaginary part of $\mu$
is a negative definite form. This ensures that the power carried by the
beam is finite. After traversing a period of the optical system, the
position q and the direction p of $\mathcal{R}$ become $q\exp(j\theta)$ and $p\exp(j\theta)$
respectively. Similarly, $\hat{q}$ and $\hat{p}$ become $\hat{q}\exp(j\hat{\theta})$ and $\hat{p}\exp(j\hat{\theta})$.
Equations (35a through c) and (34) show that $\mu$ assumes its original value
after a period (round trip) in the optical system; h, however, is
multiplied by $\exp[j(\theta + \hat{\theta})]$. The field of the fundamental gaussian beam
defined by $\mathcal{R}$ and $\hat{\mathcal{R}}$ consequently reproduces itself after a period except
for an additional phase shift equal to $kL - (\theta + \hat{\theta})/2$, where L denotes
the period (round trip) path length.

To clarify the above discussion let us observe that the modal matrix 


fee | , (940) 
ikP* P 


where Q and P were defined before in equations (82b and c), is itself a
ray matrix, i.e., satisfies equation (112). As shown in Appendix D, the
imaginary part of $\mu$ can be written $-(k^{-1}/2)(Q\tilde{Q}^*)^{-1}$; this is clearly a
negative definite form, as required. It can also be shown, using equations
(111), (16) and (21), that the mode generating function Y(r; a) given in
equation (64) is precisely the output field created by a point source
located at the input plane of a (lossy) optical system whose ray matrix
is the modal matrix, equation (94c).

Considering now the form, equation (85), obtained for the higher
order modes, it appears that the operators $A^m$ and $\hat{A}^n$ are responsible for
an increase of the phase of $\psi_{mn}$ equal to $-m\theta - n\hat{\theta}$. Therefore, the
general expression for the resonant frequencies is

$$k_{\ell mn}L = (m + \tfrac{1}{2})\theta + (n + \tfrac{1}{2})\hat{\theta} + 2\ell\pi, \tag{95}$$

where $\ell$ is an integer defining the number of wavelengths along the
system axis. This result was obtained by Popov for the special case
of linear nonorthogonal resonators incorporating homogeneous internal
media. It is shown here to be applicable to the general case.
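Equation (95) can be turned into frequencies with a few lines of code. The sketch below is illustrative only: it assumes a free-space wavenumber $k = 2\pi f/c$ (unit refractive index along the path), and the function name, the constant `C_LIGHT` and the sample angles are hypothetical choices that do not come from the text.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in vacuum, m/s


def resonant_frequency(l, m, n, theta, theta_hat, path_length):
    """Frequency (Hz) of mode (l, m, n) from equation (95).

    Assumes k = 2*pi*f/c; angles are in radians and path_length is the
    round trip length L in metres.
    """
    phase = (m + 0.5) * theta + (n + 0.5) * theta_hat + 2.0 * math.pi * l
    return C_LIGHT * phase / (2.0 * math.pi * path_length)


# Hypothetical characteristic angles and a 1 m round trip.
theta, theta_hat, L = 0.3, 0.9, 1.0
f_000 = resonant_frequency(10, 0, 0, theta, theta_hat, L)
f_100 = resonant_frequency(11, 0, 0, theta, theta_hat, L)
f_010 = resonant_frequency(10, 1, 0, theta, theta_hat, L)
print(f_100 - f_000, f_010 - f_000)
```

Under these assumptions the axial mode spacing is $c/L$, while the two transverse mode spacings are $c\theta/2\pi L$ and $c\hat{\theta}/2\pi L$; they differ whenever the two characteristic angles differ.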

Let us now investigate the case of linear cavities (cavities with 
folded optical axis). It is convenient to replace the two curved end 
mirrors of such resonators by plane mirrors and thin lenses, and take the 
reference plane at one of the end mirrors. In a round trip along the 
folded optical axis two optical systems are encountered which are 
mirror images of one another. It is shown in Appendix B that the point 
characteristic matrix assumes in that case the simple form 


$$[S] = \begin{bmatrix} U & V \\ V & U \end{bmatrix}, \tag{96}$$

where both U and V are real and symmetric. The characteristic
equations (92a) and (92b) become simply

$$(U + \cos\theta\, V)q = 0, \tag{97}$$
$$p = -j\sin\theta\, Vq, \tag{98}$$

where $\theta$ is the characteristic angle.

Let q, p and $\hat{q}$, $\hat{p}$ denote two solutions of equations (97) and (98)
(eigenrays). Equation (97) shows that the ratios of the two components
of q and $\hat{q}$, $q_2/q_1$ and $\hat{q}_2/\hat{q}_1$, respectively, are real (in any coordinate
system) since, for a stable resonator, the solutions $\theta$ and $\hat{\theta}$ of the
characteristic equation

$$|U + \cos\theta\, V| = 0 \tag{99}$$

are real. One also observes that the wavefront matrix $\mu$ is imaginary.
This result shows that, at the end mirrors of linear resonators, the
wavefronts of all of the modes coincide with the mirror surfaces, except
perhaps in some cases of degeneracy. Since U and V are symmetrical,
one further notices that*

$$\tilde{q}V\hat{q} = 0, \tag{100}$$

provided the absolute values of $\theta$ and $\hat{\theta}$ are distinct. Therefore, from
equation (98),

$$\tilde{q}\hat{p} = \tilde{q}\hat{p}^* = 0. \tag{101}$$

This relation is useful in checking numerical calculations.


VI. CAVITIES WITH IMAGE ROTATION 


As an example of application of the general theory discussed in the 
previous sections, let us calculate the resonant frequencies and the 
resonant field of a new type of optical cavity that one may call ‘‘cavity 
with image rotation.” 

Consider a nonplanar closed path (see Fig. 4) and let $\Omega$ be the rotation
experienced after a round trip by rays parallel to the optical axis.
(The value of $\Omega$ for a given orientation of the mirrors can be found in
Ref. 4.) The case where the optical system has a rotational symmetry is
of particular interest. Let $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ be the 2 × 2 ray matrix of the optical
system with rotational symmetry introduced along the path. The round
trip point characteristic matrix of the resonator is, in rectangular
coordinates,

$$[S] = b^{-1}\begin{bmatrix} a & 0 & -\cos\Omega & -\sin\Omega \\ 0 & a & \sin\Omega & -\cos\Omega \\ -\cos\Omega & \sin\Omega & d & 0 \\ -\sin\Omega & -\cos\Omega & 0 & d \end{bmatrix}. \tag{102}$$
Fig. 4—A cavity with image rotation is represented. It incorporates a lens and
four plane mirrors which define a nonplanar path. As a result of the twist of the
path, this resonator is nonorthogonal. When the lens is astigmatic, the resonating
modes do not exhibit the same patterns as in the case of more conventional
cavities.


Equations (89) and (102) show that the characteristic angles are simply

$$\theta = \theta_0 + \Omega, \tag{103a}$$
$$\hat{\theta} = \theta_0 - \Omega, \tag{103b}$$

where we have defined

$$\cos\theta_0 = (a + d)/2. \tag{104}$$

The resonant frequencies are therefore given, from the general
relation, equation (95), by

$$k_{\ell mn}L = (m + n + 1)\theta_0 + (m - n \pm 1)\Omega + 2\ell\pi. \tag{105}$$

The additional term $\pm\Omega$ in equation (105) is to be introduced when
polarization effects are taken into account. It has been assumed that
the mirrors are perfect conductors, even in number, and that the medium
is isotropic. In that case the polarization vector experiences the same
transformation as an image, i.e., a rotation $\Omega$. The + and − signs in
equation (105) refer to the clockwise and counterclockwise polarization
states, whose degeneracy is therefore lifted.
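The step from equations (89) and (102) to the characteristic angles (103) can be written out explicitly. The following worked derivation is a sketch; it assumes that the blocks of the matrix (102) are $U = (a/b)\,1$, $W = (d/b)\,1$ and $V = -b^{-1}\begin{bmatrix}\cos\Omega & \sin\Omega\\ -\sin\Omega & \cos\Omega\end{bmatrix}$, as follows from equations (111) for a rotationally symmetric system followed by a rotation.

```latex
% Ingredients of equation (89), with K = U + W = ((a + d)/b) 1 = (2 cos(theta_0)/b) 1
\begin{align*}
|V| &= b^{-2}, \qquad |K| = 4\cos^2\theta_0\, b^{-2}, \\
V_{11}K_{22} + K_{11}V_{22} - K_{12}(V_{12} + V_{21}) &= -4\cos\Omega\cos\theta_0\, b^{-2}, \\
(V_{12} - V_{21})^2 &= 4\sin^2\Omega\, b^{-2}.
\end{align*}
% Substituting lambda + 1/lambda = 2 cos(theta) in (89) and dividing by 4 b^{-2}:
\begin{align*}
\cos^2\theta - 2\cos\Omega\cos\theta_0\cos\theta + \cos^2\theta_0 - \sin^2\Omega &= 0, \\
\cos\theta = \cos\Omega\cos\theta_0 \pm \sin\Omega\sin\theta_0 &= \cos(\theta_0 \mp \Omega).
\end{align*}
```

The two roots correspond to $\theta = \theta_0 + \Omega$ and $\hat{\theta} = \theta_0 - \Omega$ (and their negatives), and inserting them in equation (95) gives $(m + n + 1)\theta_0 + (m - n)\Omega + 2\ell\pi$, which is equation (105) without the polarization term.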

The eigenvectors q, p and $\hat{q}$, $\hat{p}$ have respectively clockwise and
counterclockwise circular polarizations too, as one expects from the
rotational symmetry of the system; they are independent of the image
rotation $\Omega$. The components of $\mathcal{R}(q, p)$ and $\hat{\mathcal{R}}(\hat{q}, \hat{p})$ are respectively, to
within arbitrary constants,

$$q(jb,\; b), \qquad \hat{q}(jb,\; -b),$$
$$p(-\sin\theta_0,\; j\sin\theta_0), \qquad \hat{p}(-\sin\theta_0,\; -j\sin\theta_0). \tag{106}$$

Setting for brevity $2^{1/2}x^{1}/w \equiv x_1$ and $2^{1/2}x^{2}/w \equiv x_2$, where w is the beam
radius, the mode $\psi_{mn}$ assumes in rectangular coordinates, from
equation (85), the form

$$\psi_{mn} = [H(x_1) + jH(x_2)]^m[H(x_1) - jH(x_2)]^n\,\psi_{00}. \tag{107a}$$

This expression, being independent of $\Omega$, should coincide with known
forms (see Ref. 15 or 11) which can be written

$$(-1)^n n!\,2^{m+n}Z^{m-n}L_n^{m-n}(ZZ^*)\,\psi_{00} \quad \text{if } m \ge n, \tag{107b}$$
$$(-1)^m m!\,2^{m+n}Z^{*\,n-m}L_m^{n-m}(ZZ^*)\,\psi_{00} \quad \text{if } m \le n, \tag{107c}$$

where $L_n^{\alpha}$ denotes a generalized Laguerre polynomial, and

$$Z = x_1 + jx_2. \tag{108}$$


A relation between Hermite polynomials and generalized Laguerre
polynomials was given before, in a different form, by J. R. Pierce and
S. P. Morgan (private communication). The identity of the right side
of equation (107a) and equations (107b) and (107c) is easily
demonstrated for the special cases n (or m) = 0, and m = n, using
well-known formulas,* and verified for the first values of m, n. The
field consequently assumes the same form as in ordinary cavities. A
rotation $\Omega$ about the z axis of the beam pattern can be expressed by
a multiplication of Z by $\exp(j\Omega)$ and consequently, from equation
(107), by a phase shift $(m - n)\Omega$, in agreement with equation (105).
The distinctive feature of cavities with image rotation compared with
ordinary cavities, in addition to the polarization properties mentioned
before, is that the intensity pattern of the resonant field necessarily
has a circular symmetry.
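The identity between the Hermite form (107a) and the Laguerre forms (107b) and (107c) can be checked numerically for the first values of m and n. The sketch below is an illustration under two stated assumptions: the powers in (107a) are read symbolically, so that $H(x)^k$ stands for the Hermite polynomial $H_k(x)$ (physicists' convention), and the common factor $\psi_{00}$ is dropped from both sides. All function names are hypothetical.

```python
from math import comb, factorial


def hermite(k, x):
    """Physicists' Hermite polynomial H_k(x) by the three-term recurrence."""
    h_prev, h = 1.0, 2.0 * x
    if k == 0:
        return h_prev
    for i in range(1, k):
        h_prev, h = h, 2.0 * x * h - 2.0 * i * h_prev
    return h


def gen_laguerre(n, alpha, x):
    """Generalized Laguerre polynomial L_n^alpha(x) for integer alpha >= 0."""
    return sum((-1) ** i * comb(n + alpha, n - i) * x ** i / factorial(i)
               for i in range(n + 1))


def hermite_form(m, n, x1, x2):
    """Equation (107a) without psi_00, expanding the symbolic powers."""
    total = 0j
    for p in range(m + 1):        # binomial terms of [H(x1) + jH(x2)]^m
        for q in range(n + 1):    # binomial terms of [H(x1) - jH(x2)]^n
            coeff = comb(m, p) * comb(n, q) * (1j) ** p * (-1j) ** q
            # Symbolic product H(x1)^(m+n-p-q) H(x2)^(p+q) becomes
            # H_{m+n-p-q}(x1) * H_{p+q}(x2).
            total += coeff * hermite(m + n - p - q, x1) * hermite(p + q, x2)
    return total


def laguerre_form(m, n, x1, x2):
    """Equations (107b) and (107c) without psi_00."""
    z = complex(x1, x2)
    rho2 = x1 * x1 + x2 * x2
    if m >= n:
        return ((-1) ** n * factorial(n) * 2 ** (m + n)
                * z ** (m - n) * gen_laguerre(n, m - n, rho2))
    return ((-1) ** m * factorial(m) * 2 ** (m + n)
            * z.conjugate() ** (n - m) * gen_laguerre(m, n - m, rho2))


x1, x2 = 0.7, -0.4  # arbitrary test point
max_diff = max(abs(hermite_form(m, n, x1, x2) - laguerre_form(m, n, x1, x2))
               for m in range(4) for n in range(4))
print(max_diff)
```

Because both forms depend on the transverse coordinates only through $Z^{m-n}$ (or its conjugate) and $ZZ^*$, a rotation of the pattern by $\Omega$ multiplies the mode by $\exp[j(m - n)\Omega]$, which is the phase shift quoted above.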

* See Ref. 33. Notice that a factor $2^{2n}$ is missing in equation (32), p. 195, of this
book.

When the optical system introduced along the nonplanar path is
astigmatic, one must use the general expressions given in the previous
sections. This case has been studied numerically, using equation (85),
for the case of a resonator incorporating a single spherical mirror of
radius R = 6 m, operating at an incidence angle of 30°, and an odd
number of plane mirrors. The spherical mirror is equivalent to an
astigmatic lens of focal lengths $f_1$ = 2.6 m and $f_2$ = 3.47 m. Assuming
a round trip path length L = 1 m and an image rotation $\Omega$ = 20°,
one obtains for the point characteristic matrices, from equation (115),
with d = 1, $\nu$ = 0,

$$U = \begin{bmatrix} 0.615 & 0 \\ 0 & 0.712 \end{bmatrix}, \qquad V = \begin{bmatrix} -0.94 & -0.34 \\ 0.34 & -0.94 \end{bmatrix}, \qquad W = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$

The characteristic angles are

$$\theta = -13.3°, \qquad \hat{\theta} = -54°,$$

from which the resonant frequencies can be obtained. The components
of the eigenrays $\mathcal{R}$ and $\hat{\mathcal{R}}$ are respectively, in a rectangular coordinate
system,

$$q(1,\; -j1.35), \qquad p(0.19 - j0.66,\; -0.62 - j0.19),$$
$$\hat{q}(1,\; j0.91), \qquad \hat{p}(0.19 - j0.57,\; 0.49 + j0.13).$$


These two eigenrays fulfill, as expected, the condition $\tilde{q}\hat{p} = \tilde{\hat{q}}p$; they
define a wavefront matrix

$$\mu = \begin{bmatrix} 0.096 - j0.305 & 0.0206 \\ 0.0206 & 0.072 - j0.246 \end{bmatrix}$$

whose imaginary part is a negative definite form, as required. The
intensity pattern for the mode $\psi_{00}$ is shown in Fig. 5. It is
intermediate between the circularly symmetric patterns observed when
$f_1 = f_2$, and the usual orthogonal patterns observed for $\Omega$ = 0 (see,
for instance, the TEM$_{00}$ mode in Fig. 7 of Ref. 11).

Fig. 5—This figure represents the constant intensity curves of the TEM$_{00}$ mode
in a nonorthogonal cavity incorporating a 6 m radius mirror with an incidence
angle of 30° and an image rotation of 20°. The optical axis path length is 1 m, and
the wavelength is 1 μm.
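The numerical example above can be retraced with a short script. This is a sketch under stated assumptions rather than the original computation: the blocks U, V, W follow equation (115) with $\nu = 0$, the eigenrays are normalized so that their first position component is 1, and the wavefront matrix is taken as $\mu = \tfrac{1}{2}PQ^{-1}$, a factor that is assumed here because it matches the magnitudes quoted in the text. Variable names are hypothetical.

```python
import cmath
import math

# Data quoted in the text (focal lengths and path in metres).
f1, f2, d = 2.6, 3.47, 1.0
omega = math.radians(20.0)
co, so = math.cos(omega), math.sin(omega)

# Blocks of the point characteristic matrix, equation (115) with nu = 0.
U = [[1 / d - 1 / f1, 0.0], [0.0, 1 / d - 1 / f2]]
V = [[-co / d, -so / d], [so / d, -co / d]]
W = [[1 / d, 0.0], [0.0, 1 / d]]
K = [[U[i][j] + W[i][j] for j in range(2)] for i in range(2)]

# Quadratic (89) in s = lambda + 1/lambda = 2 cos(theta).
a = V[0][0] * V[1][1] - V[0][1] * V[1][0]
b = V[0][0] * K[1][1] + K[0][0] * V[1][1] - K[0][1] * (V[0][1] + V[1][0])
c = K[0][0] * K[1][1] - K[0][1] * K[1][0] - (V[0][1] - V[1][0]) ** 2
root = math.sqrt(b * b - 4 * a * c)
cosines = [(-b + root) / (4 * a), (-b - root) / (4 * a)]
# Negative signs are selected, as in the text, so that Im(mu) is negative definite.
theta, theta_hat = (-math.acos(x) for x in cosines)


def eigenray(angle):
    """Return (q, p) from equations (87a) and (87b) for lambda = exp(j*angle)."""
    lam = cmath.exp(1j * angle)
    m = [[K[i][j] + lam * V[i][j] + V[j][i] / lam for j in range(2)]
         for i in range(2)]
    q = [1.0 + 0.0j, -m[0][0] / m[0][1]]          # null vector of (87a)
    n = [[W[i][j] - U[i][j] + V[j][i] / lam - lam * V[i][j] for j in range(2)]
         for i in range(2)]
    p = [0.5 * (n[i][0] * q[0] + n[i][1] * q[1]) for i in range(2)]
    return q, p


q, p = eigenray(theta)
q_hat, p_hat = eigenray(theta_hat)

# Lagrange product (R; R_hat), expected to vanish.
lagrange = (q[0] * p_hat[0] + q[1] * p_hat[1]
            - q_hat[0] * p[0] - q_hat[1] * p[1])

# Wavefront matrix mu = (1/2) P Q^-1 with Q = [q q_hat], P = [p p_hat].
det_q = q[0] * q_hat[1] - q_hat[0] * q[1]
q_inv = [[q_hat[1] / det_q, -q_hat[0] / det_q], [-q[1] / det_q, q[0] / det_q]]
P = [[p[0], p_hat[0]], [p[1], p_hat[1]]]
mu = [[0.5 * (P[i][0] * q_inv[0][j] + P[i][1] * q_inv[1][j]) for j in range(2)]
      for i in range(2)]

print(math.degrees(theta), math.degrees(theta_hat))
print(q, p, q_hat, p_hat, mu, abs(lagrange))
```

The script reproduces the characteristic angles, the eigenray components and the wavefront matrix to within the rounding of the quoted figures, and the vanishing Lagrange product gives the numerical check mentioned after equation (101).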


VII. CONCLUSION 


It has been shown that, within the first order approximation, the 
solutions of the scalar wave equation can be expressed in terms of the 
solutions of the (simpler) ray equations. The fundamental mode of 
propagation in nonorthogonal media was obtained by generalizing the 
expression for the field of astigmatic ray-pencils. An oblique coordinate 
system has been introduced which reduces this solution to the form 
assumed by ordinary gaussian beams. The higher order modes of prop- 
agation were also obtained; they can be expressed as the product of 
the fundamental solution and Hermite polynomials in one real variable. 
The results of Popov for the resonant frequencies of nonorthogonal
resonators were extended to resonators incorporating arbitrary lens-like 
media and were applied to a new type of cavity which exhibits interest- 
ing resonance and polarization properties. This theory may also be 
useful for special optical waveguides such as the helical gas lenses, and 
for analysis of optical systems which are nominally orthogonal, but 
which suffer from small distortions in three dimensions. 


APPENDIX A 


Relations Between the Point Characteristic Matrix and the Ray Matrix


It has been shown in the main text [equation (18)] that the direction
vectors p, p′ of a ray at the input and output plane of an optical
system are related to the position vectors q, q′ by the following
matricial relation

$$\begin{bmatrix} -p \\ p' \end{bmatrix} = \begin{bmatrix} U & V \\ \tilde{V} & W \end{bmatrix}\begin{bmatrix} q \\ q' \end{bmatrix} \equiv [S]\begin{bmatrix} q \\ q' \end{bmatrix}, \tag{109}$$

where U and W are 2 × 2 real symmetric matrices, V is a 2 × 2 real
matrix. [S] is a 4 × 4 symmetric matrix which has been called the point
characteristic matrix. One also sometimes defines a ray matrix, [M],
which relates the position and direction vectors of a ray at the output
plane to the values assumed at the input plane

$$\begin{bmatrix} q' \\ p' \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}\begin{bmatrix} q \\ p \end{bmatrix} \equiv [M]\begin{bmatrix} q \\ p \end{bmatrix}, \tag{110}$$

where A, B, C, D are 2 × 2 real matrices. Since, from equation (109),
only 10 numbers suffice to define the optical system, the elements of the
4 × 4 ray matrix [M] must be related by 16 − 10 = 6 relations. To
obtain these relations, let us compare equations (109) and (110). One
obtains readily

$$U = B^{-1}A, \tag{111a}$$
$$V = -B^{-1}, \tag{111b}$$
$$\tilde{V} = C - DB^{-1}A, \tag{111c}$$
$$W = DB^{-1}. \tag{111d}$$


Since U and W are symmetrical one has

$$A\tilde{B} - B\tilde{A} = 0, \tag{112a}$$
$$\tilde{B}D - \tilde{D}B = 0, \tag{112b}$$

and, by comparing the expressions obtained for V and $\tilde{V}$, equations
(111b) and (111c), and using equation (112b), one finds that

$$\tilde{D}A - \tilde{B}C = 1. \tag{112c}$$

Equations (112a, b and c) are equivalent to those given by Luneburg.
They effectively correspond to six independent relations. The relations
inverse of equations (111a through d) are

$$A = -V^{-1}U, \tag{113a}$$
$$B = -V^{-1}, \tag{113b}$$
$$C = \tilde{V} - WV^{-1}U, \tag{113c}$$
$$D = -WV^{-1}. \tag{113d}$$
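As a quick illustration of equations (111) and (112), consider the one-dimensional case of a thin lens of focal length f followed by a free-space section of length d. This worked example is a sketch that assumes the usual paraxial ray matrix for that sequence; in one dimension all the blocks are scalars, so the transpositions play no role.

```latex
% Ray matrix of a thin lens (focal length f) followed by free space (length d):
\begin{align*}
\begin{bmatrix} A & B \\ C & D \end{bmatrix}
  &= \begin{bmatrix} 1 & d \\ 0 & 1 \end{bmatrix}
     \begin{bmatrix} 1 & 0 \\ -1/f & 1 \end{bmatrix}
   = \begin{bmatrix} 1 - d/f & d \\ -1/f & 1 \end{bmatrix}. \\
% Equations (111a) through (111d):
U &= B^{-1}A = \frac{1}{d} - \frac{1}{f}, \qquad
V = -B^{-1} = -\frac{1}{d}, \\
\tilde{V} &= C - DB^{-1}A = -\frac{1}{f} - \frac{1}{d}\Bigl(1 - \frac{d}{f}\Bigr) = -\frac{1}{d}, \qquad
W = DB^{-1} = \frac{1}{d}. \\
% Equation (112c):
\tilde{D}A - \tilde{B}C &= \Bigl(1 - \frac{d}{f}\Bigr) + \frac{d}{f} = 1.
\end{align*}
```

The two expressions for V and $\tilde{V}$ agree, as they must for a scalar, and the values of U, V and W are the one-dimensional counterparts of the blocks given in equation (115) of Appendix B.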


APPENDIX B 


Point Characteristic Matrix of a Sequence of Thin Lenses and Mirrors- 
Symmetrical Systems 


The point characteristic matrix [S] of a sequence of thin astigmatic
lenses and plane mirrors, arbitrarily oriented in space, can be obtained
in closed form.

Let us first consider a thin astigmatic lens oriented at an angle $\nu$
with respect to the $x_1$ axis of an $x_1x_2$ rectangular coordinate system,
with focal lengths $f_1$, $f_2$. This lens is followed by a section of free space
of length d. For generality, one further assumes that the output
coordinate system is rotated by an angle $\Omega$ about the z axis. This rotation
has to be introduced in the case of nonplanar paths. Using the
expression for the optical thickness of a lens, and the paraxial
approximation of the length of tilted rays in free space, one obtains

$$[S] = \begin{bmatrix} U & V \\ \tilde{V} & W \end{bmatrix}, \tag{114}$$

with

$$U = \begin{bmatrix} \dfrac{1}{d} - \left(\dfrac{\cos^2\nu}{f_1} + \dfrac{\sin^2\nu}{f_2}\right) & \cos\nu\sin\nu\left(\dfrac{1}{f_1} - \dfrac{1}{f_2}\right) \\ \cos\nu\sin\nu\left(\dfrac{1}{f_1} - \dfrac{1}{f_2}\right) & \dfrac{1}{d} - \left(\dfrac{\cos^2\nu}{f_2} + \dfrac{\sin^2\nu}{f_1}\right) \end{bmatrix},$$

$$V = -d^{-1}\begin{bmatrix} \cos\Omega & \sin\Omega \\ -\sin\Omega & \cos\Omega \end{bmatrix},$$

$$W = d^{-1}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}. \tag{115}$$


These expressions are applicable to curved mirrors under oblique
incidence with little modification since a curved mirror is equivalent to a
plane mirror and a lens, in the most general case. It remains to
calculate the point characteristic matrix of a sequence of optical systems
such as the one described by equations (114) and (115).

The point characteristic matrix $[S_t]$ of a sequence of two optical
systems whose point characteristic matrices are respectively $[S_1]$ and
$[S_2]$ is obtained by using equation (18) of the main text, and specifying
that the rays are continuous at the junction between the two systems.
One obtains

$$[S_t] = \begin{bmatrix} U_1 - V_1(W_1 + U_2)^{-1}\tilde{V}_1 & -V_1(W_1 + U_2)^{-1}V_2 \\ -\tilde{V}_2(W_1 + U_2)^{-1}\tilde{V}_1 & W_2 - \tilde{V}_2(W_1 + U_2)^{-1}V_2 \end{bmatrix}. \tag{116}$$

In the special case where the second optical system is the mirror
image of the first system with respect to their common plane, $[S_t]$
reduces to

$$[S_t] \equiv \begin{bmatrix} U_t & V_t \\ \tilde{V}_t & W_t \end{bmatrix} = \begin{bmatrix} U - \tfrac{1}{2}VW^{-1}\tilde{V} & -\tfrac{1}{2}VW^{-1}\tilde{V} \\ -\tfrac{1}{2}VW^{-1}\tilde{V} & U - \tfrac{1}{2}VW^{-1}\tilde{V} \end{bmatrix}, \tag{117}$$

where the index 1 has been omitted. Equation (117) shows that, in
a symmetric system, $U_t$ is equal to $W_t$ and $V_t$ is a symmetric matrix.

Repeated applications of equation (116) and equations (114) and
(115) give the point characteristic matrix of an arbitrary sequence of
lenses or mirrors.
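The cascading rule can be sketched in a few lines. The block below is an illustration, not the original program: it builds the blocks (U, V, W) of equation (115) for one lens-plus-path section, combines two sections with equation (116), and checks the symmetric case against equation (117). It assumes that the mirror image of a system with blocks (U, V, W) has blocks (W, $\tilde{V}$, U); the helper names and the sample parameters are hypothetical.

```python
import math


def mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def inv(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]


def lens_and_space(f1, f2, nu, d, omega):
    """Blocks (U, V, W) of equation (115): astigmatic lens, path d, rotation."""
    c, s = math.cos(nu), math.sin(nu)
    off = c * s * (1 / f1 - 1 / f2)
    u = [[1 / d - (c * c / f1 + s * s / f2), off],
         [off, 1 / d - (c * c / f2 + s * s / f1)]]
    co, so = math.cos(omega), math.sin(omega)
    v = [[-co / d, -so / d], [so / d, -co / d]]
    w = [[1 / d, 0.0], [0.0, 1 / d]]
    return u, v, w


def cascade(s1, s2):
    """Equation (116): blocks of system 1 followed by system 2."""
    u1, v1, w1 = s1
    u2, v2, w2 = s2
    g = inv(add(w1, u2))
    ut = sub(u1, mul(mul(v1, g), transpose(v1)))
    vt = [[-x for x in row] for row in mul(mul(v1, g), v2)]
    wt = sub(w2, mul(mul(transpose(v2), g), v2))
    return ut, vt, wt


def mirror_image(s):
    """Assumed blocks of the mirror-image system: (W, V transposed, U)."""
    u, v, w = s
    return w, transpose(v), u


first = lens_and_space(2.6, 3.47, math.radians(15.0), 1.0, math.radians(20.0))
ut, vt, wt = cascade(first, mirror_image(first))

# Closed form of equation (117) for comparison.
u, v, w = first
half = [[0.5 * x for x in row] for row in mul(mul(v, inv(w)), transpose(v))]
expected_u = sub(u, half)
print(ut, vt, wt)
```

With these assumptions the cascaded system has $U_t = W_t$ and a symmetric $V_t$, in agreement with the remark following equation (117).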


APPENDIX C 


Diagonalization of a Complex Wavefront 


The need for introducing an oblique coordinate system at each 
transverse plane has been outlined in the main text. Detailed transforma- 
tion formulas are given in this appendix. 

Let $e_1$, $e_2$ be the base vectors, of unit length, of the original rectangular
coordinate system, and $\mathbf{e}_1$, $\mathbf{e}_2$ the base vectors, also of unit length, of a
new coordinate system.† The $\mathbf{e}_i$, i = 1, 2, are linearly related to the
$e_j$, j = 1, 2, by

$$\mathbf{e}_i = \delta_i^j e_j, \tag{118}$$

where $\delta_i^j$ is the mixed tensor which expresses the coordinate
transformation. The reduced eikonal $\Phi$ is a complex quadratic form which was
written in the original coordinate system [equation (53)]

$$\Phi = \tilde{r}\mu r = \tilde{r}(\mu' + j\mu'')r, \tag{119}$$

where $\mu'$ and $\mu''$ are real symmetric matrices.

By stipulating that, in the new coordinate system, the off-diagonal
terms of $\mu'$ and $\mu''$ are both equal to zero, one obtains the
transformation [$\delta$] which diagonalizes $\Phi$


(4) = i i] e Tee a fall (120) 
o 2 2\-4 


53 ul +wy? Gd+wvy3 
where 
u = (c/a = [—b + (b” — 4ac)*]/2a, (121) 
Q = pitts — Miki (122a) 
b = mito — Misbas y (122b) 
C = paolia — Maobic - (122c) 


The law of transformation of the contravariant components of a vector q,
denoted respectively $q^i$ in the old system and $\mathbf{q}^i$ in the new system, is
(omitting the summation sign)

$$q^i = \delta_j^i\,\mathbf{q}^j. \tag{123a}$$

This relation is also applicable to the coordinates $x^i$:

$$x^i = \delta_j^i\,\mathbf{x}^j. \tag{123b}$$

The covariant components of a vector p, denoted $p_i$ in the old system
and $\mathbf{p}_i$ in the new system, transform according to the inverse relation

$$\mathbf{p}_i = \delta_i^j\,p_j. \tag{124}$$

Expressions for the new components of $\mu$ are derived in the main text.


† Quantities relative to the new system are denoted by bold face letters in this
Appendix. Ordinary letters are used in the main text, where there is no risk
of confusion.
APPENDIX D


Orthogonality of the Modes 


Let q, p and $\hat{q}$, $\hat{p}$ be any two solutions of the paraxial ray equations,
equations (11a and b), and assume that the matrix $PQ^{-1}$, where

$$Q = [q \;\; \hat{q}], \tag{125a}$$
$$P = [p \;\; \hat{p}], \tag{125b}$$

is symmetric.
An infinite set of solutions of the parabolic wave equation has been 
obtained in the main text in the form [equation (69)] 


nalts #3 ,P) = 1Q [exp (—s8PQ"'r)Haalxi»), (126) 
where H,,,, denote the Hermite polynomial in two variables x, , x2 
x = Qtr, (127) 
associated with the quadratic form xvx, where 


jkQ*(PQ” — P*Q*™’)Q* 


Ill 


v 


= JQ'Q*, (128) 
* pak An*® — Ko*¥ 

ys a Ba" ap" — Pa | (129) 
apt — gt opt — Be 


Let us now impose on the rays the additional condition

$$(\mathcal{R};\hat{\mathcal{R}}^*) = \tilde{q}\hat{p}^* - \tilde{p}\hat{q}^* = 0 \tag{130}$$

and assume that the diagonal terms of J are positive. Since, as pointed
out in the main text, the two rays need be defined only to within
constants, they can be normalized in such a way that J is the unit matrix.
In that case we have, from equations (127) and (128),

$$\nu^{-1} = \nu^*, \tag{131}$$
$$\nu x = x^*. \tag{132}$$

Consequently

$$H^*_{mn}(x;\nu) = H_{mn}(x^*;\nu^*) = G_{mn}(x;\nu), \tag{133}$$

where we have introduced the adjoint polynomials $G_{mn}$ defined by

$$G_{mn}(x;\nu) = H_{mn}(\nu x;\nu^{-1}). \tag{134}$$
The orthogonality condition for two solutions $\psi_{mn}$ and $\psi_{m'n'}$ [equation
(126)] can now be written in the form


I = Vinal) Vera (r) dr 


I 


LO7O* ff exp ($x) Hal )Gmrnrlxse) Bx, 


=2rm!n! if m’ =m and n’ =n, 


=) if m’~m or nn #n. (135) 


The biorthogonality property of the polynomials $H_{mn}$ and $G_{mn}$ has
been used in equation (135).
Optical Resonators With Variable
Reflectivity Mirrors


By H. ZUCKER 
(Manuscript received May 27, 1970) 


In this paper we investigate circular optical resonators with gaussian
profiles of the mirror reflectivities. Closed form solutions to the integral
equations for such resonators are obtained. The dominant TEM$_{0,0}$ mode
characteristics of a resonator consisting of one variable reflectivity mirror
(VRM) and one uniform reflectivity mirror (URM) are considered in
detail for a variety of parameters. This resonator is particularly suitable
for high-gain lasers. Its advantages in comparison to the conventional type
are: (i) there is larger mode volume utilization, and (ii) the power
transmitted at the variable reflectivity mirror can in principle be utilized
as the power output. We discuss the dependence of the spot sizes on laser
gain and mirror-curvature tolerances and present a specific design of a
Fabry-Perot resonator for fundamental mode operation and the expected
performance.


I. INTRODUCTION 


The dominance of the fundamental mode in optical resonators with 
uniform reflectivity mirrors is due to the lowest diffraction loss of this 
mode. The power output of this mode is commonly obtained by using 
a partially transparent mirror. These two features could be combined 
in a resonator consisting of one uniform reflectivity mirror (URM) 
and one variable reflectivity mirror (VRM). 

Resonators with VRM were investigated previously. S. N. Vlasov
and V. I. Talanov¹ considered symmetrical two-dimensional resonators
with two types of variations of the mirror reflectivities including
the gaussian and obtained solutions for the eigenvalues of the resonator
integral equations. N. G. Vakhimov² investigated the natural resonant
frequencies and field distributions of symmetrical resonators with
gaussian VRM by using an asymptotic method of solution to the wave
equation subject to impedance boundary conditions. N. Kumagai and
others³ investigated Fabry-Perot resonators with VRM of finite
dimensions by solving numerically the resonator integral equation for
different mirror reflectivities. Y. Suematsu and others⁴ studied beam
waveguides with gaussian transmission filters for the improvement of the
stability of beam transmission.

In this work we investigate nonsymmetrical circular resonators con- 
sisting of one URM and one VRM. The radii of curvature of the 
mirrors are arbitrary. The reflection coefficients of the VRM are as- 
sumed to have gaussian profiles in the radial direction. For such 
resonators with infinite mirrors, solutions to the resonator integral 
equations are obtained in terms of Laguerre functions with complex 
arguments. The modal fields decrease off-axis very rapidly and con- 
sequently these solutions are also applicable to resonators with finite 
mirrors. 

Resonators of the type considered seem to be particularly suitable
for high-gain lasers, as for example CO$_2$ lasers. It is shown subsequently
that for the fundamental TEM$_{0,0}$ mode, the spot sizes obtainable are
considerably larger than those obtained with URM resonators of the
same length and the same fundamental mode threshold gain ratio.
This should result in a larger mode volume utilization. Furthermore,
the power loss due to the transparency of VRM could be utilized as
the power output.

In the following sections the solutions for the resonator modes and
eigenvalues of nonsymmetrical resonators are obtained. It is shown that
the solutions for symmetrical resonators consisting of two identical
VRM are readily obtainable as special cases. Mode-stability criteria
are established as functions of the resonator geometries and VRM
parameters. The spot sizes of the fundamental TEM$_{0,0}$ mode are
computed as a function of the threshold-gain ratio for a variety of
parameters. A comparison is made between the obtainable spot sizes
with Fabry-Perot resonators with VRM and URM. We show that
much larger spot sizes can be achieved with the VRM resonator.
We show also that the spot-size diameters of VRM mirrors depend
basically on the threshold-gain ratio and on the mirror-curvature
tolerances. A specific design of a Fabry-Perot resonator with a VRM is
examined and the expected performance is presented.


II. NONSYMMETRICAL RESONATORS 


2.1 Solutions to the Integral Equation (IE) 


The geometry of the nonsymmetrical resonator is shown in Fig. 1.
It consists of one mirror with variable reflectivity (M1) and one mirror
with uniform reflectivity (M2). The separation between the mirrors
is d, and the radii of curvature are designated by $R_1$ and $R_2$ respectively.
The reflection coefficient of the VRM, $\Gamma$, is assumed to vary in the
radial direction $\rho$ as follows:

$$\Gamma = \Gamma_0\exp(-\beta\rho^2), \tag{1}$$

where $\Gamma_0$ and $\beta$ are constants with $|\Gamma_0| \le 1$.

The reflection coefficient of the URM is assumed to be unity. (A
reflection coefficient different from unity can readily be included in the
solution.)

[Figure labels: Mirror 1 (M1), $\Gamma_1 = \Gamma_0\exp(-\beta\rho^2)$; Mirror 2 (M2), $\Gamma_2 = 1.0$.]

Fig. 1—Nonsymmetrical resonator.

The integral equations for this resonator are obtained in a manner
analogous to a URM resonator, by imposing the condition that the
field should reproduce itself after a round trip. With the azimuthal
dependence for the electric field $E(\rho, \phi) = \exp(-j\ell\phi)F_\ell(\rho)$, the two
simultaneous integral equations are:

$$K_\ell^{(2)}F_\ell^{(2)}(\rho_2) = j^{\ell+1}\exp(-jkd)\,M\int_0^\infty F_\ell^{(1)}(\rho_1)\exp[-jM(g_1\rho_1^2 + g_2\rho_2^2)/2]\,J_\ell(M\rho_1\rho_2)\,\rho_1\,d\rho_1, \tag{2}$$

$$K_\ell^{(1)}F_\ell^{(1)}(\rho_1) = j^{\ell+1}\exp(-jkd)\,M\,\Gamma_0\exp(-\beta\rho_1^2)\int_0^\infty F_\ell^{(2)}(\rho_2)\exp[-jM(g_1\rho_1^2 + g_2\rho_2^2)/2]\,J_\ell(M\rho_1\rho_2)\,\rho_2\,d\rho_2, \tag{3}$$

where $F_\ell^{(1)}(\rho_1)$, $F_\ell^{(2)}(\rho_2)$ are the radial field distributions at (M1) and
(M2) respectively, $K_\ell^{(1)}$ and $K_\ell^{(2)}$ are the associated eigenvalues, $J_\ell$
is a Bessel function of order $\ell$, $M = 2\pi/\lambda d$, $\lambda$ is the free-space
wavelength and

$$g_1 = 1 - \frac{d}{R_1}, \tag{4}$$
$$g_2 = 1 - \frac{d}{R_2}. \tag{5}$$

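The integral equations (2) and (3) are solved by using the
self-reciprocal properties of the Laguerre functions on the Hankel
transform. These properties are

$$\int_0^\infty x^{\ell+1/2}\exp(-\beta x^2)L_n^\ell(\alpha x^2)J_\ell(xy)(xy)^{1/2}\,dx = 2^{-\ell-1}\beta^{-\ell-n-1}(\beta - \alpha)^n y^{\ell+1/2}\exp\left(-\frac{y^2}{4\beta}\right)L_n^\ell\left[\frac{\alpha y^2}{4\beta(\alpha - \beta)}\right], \tag{6}$$

where $L_n^\ell$ is a Laguerre function of order $\ell$, n.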


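Based on equation (6), the modal solutions to the integral equations
are:

$$F_\ell^{(1)}(\rho_1) = \exp(-\gamma_1\rho_1^2/2)\,L_n^\ell(\alpha_1\rho_1^2)\,(\sqrt{\alpha_1}\,\rho_1)^\ell, \tag{7}$$
$$F_\ell^{(2)}(\rho_2) = \exp(-\gamma_2\rho_2^2/2)\,L_n^\ell(\alpha_2\rho_2^2)\,(\sqrt{\alpha_2}\,\rho_2)^\ell. \tag{8}$$

After some algebraic manipulations, the following relations are
obtained for the parameters.

$$\gamma_1 = \alpha_1 + \beta, \tag{9}$$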
2 =i iV = nak, : 
ay + E oe i(«, +) | (10) 


It is convenient to express $\alpha_1$ in terms of a complex trigonometric
function as follows

$$\alpha_1 = \frac{M}{2g_2}\cosh\delta \tag{11}$$

with $\delta = \Delta + j\Lambda$. In equations (9) and (10), $\Delta$ and $\Lambda$ are related to
the resonator geometry and reflectivity parameters by

$$\sinh\Delta\cos\Lambda = \frac{2g_2\beta}{M}, \tag{12}$$
$$\cosh\Delta\sin\Lambda = (2g_1g_2 - 1). \tag{13}$$
Furthermore

$$\gamma_2 = \alpha_2, \tag{14}$$

$$\alpha_2 = Mg_2\,\frac{\cos\Lambda + j\sinh\Delta}{\sin\Lambda + \cosh\Delta}. \tag{15}$$


The associated eigenvalues $K_\ell^{(1)}$ and $K_\ell^{(2)}$ are:


: om f+1 
Ky? = Tyexp (ih) | iow C3] exp (—né), (16) 
“Y2 





(2) __ 4h n 7292 7 se i 
Ke = exp (—jkd)j E (8) + | exp (—7n6). (17) 

The eigenvalue $K_\ell$, which gives the decrease of the reflected field
after a double pass, is the product of the above two eigenvalues given
by equations (16) and (17).

$$K_\ell = K_\ell^{(1)}K_\ell^{(2)} = (-1)^n j^{\ell+1}\exp(-j2kd)\,\Gamma_0\exp[-(2n + \ell + 1)\delta]. \tag{18}$$

The eigenvalues $K_\ell$ are exponentially decreasing with $\ell$, n. The largest
eigenvalue is obtained for n = $\ell$ = 0, corresponding to the fundamental
TEM$_{0,0}$ mode. The next eigenvalue corresponds to the TEM$_{1,0}$ mode
($\ell$ = 1, n = 0). Since the eigenvalues are related to the power loss, the
fundamental mode selectivity will depend primarily on the eigenvalues
for the TEM$_{0,0}$ and TEM$_{1,0}$ modes.

It is of interest to examine the special case $g_2$ = 1, which corresponds
to a resonator with a perfectly reflecting planar mirror M2. This
resonator is completely equivalent to a symmetrical resonator consisting
of two identical VRM separated by 2d. Both the eigenvalues and the
fields are the same with the fields beyond d being equal to the reflected
fields of the nonsymmetrical resonator.

The modes of the nonsymmetrical resonator are orthogonal at the
uniform mirror M2, since $\gamma_2 = \alpha_2$. However, at the VRM neither the
incident modes nor the reflected modes are orthogonal. This is shown in
Appendix A. In addition it is shown that for any particular mode, the
ratio of the reflected to the incident power at the VRM is precisely
equal to the absolute value squared of the eigenvalue $K_\ell$. Physically this
condition corresponds to conservation of power.


2.2 Stability Criteria 


For a resonator mode to be stable, it is necessary that the exponential
factors $\gamma_1$ and $\gamma_2$ in equations (7) and (8) be finite and have a positive
real part. Both these factors are dependent on $\cos\Lambda$ and $g_2$. The limits
of the stability regions are thus: (a) $g_2$ = 0 and (b) $\cos\Lambda$ = 0. The
second condition can be expressed in terms of the resonator parameters
by using equation (13).

$$\cos\Lambda = \left[1 - \left(\frac{2g_1g_2 - 1}{\cosh\Delta}\right)^2\right]^{1/2} \tag{19}$$

and the second limit of the stability region is

$$g_1 = \frac{1 \pm \cosh\Delta}{2g_2}. \tag{20}$$

Equation (20) contains the special case of the uniform reflectivity
resonator ($\cosh\Delta$ = 1). For this special case equation (20) reduces to the
stability criterion derived by G. D. Boyd and H. Kogelnik.

In Fig. 2 illustrative stability diagrams are shown as a function of
$g_1$ and $g_2$ with $\exp(2\Delta)$ as a parameter. (The choice of this parameter is
discussed subsequently.)

A few special cases are considered.

(i) $g_1$ = 0.
This resonator is stable for all values of $g_2$, except $g_2$ = 0.
(ii) $g_2$ = 0.
This resonator is unstable independent of the curvature of M1.
(iii) $g_1 = g_2$ = 0.
This is the very special case of the confocal resonator and is in
general unstable, except for a URM resonator ($\beta$ = 0).
(iv) $g_1 = g_2$ = 1.
This is the Fabry-Perot resonator and is stable with a variable
reflectivity mirror.
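The relations (12), (13) and (20) lend themselves to a short numerical sketch. The block below is illustrative and rests on stated assumptions: equations (12) and (13) are combined into the single complex relation $\sinh(\Delta + j\Lambda) = 2g_2\beta/M + j(2g_1g_2 - 1)$, and the branch with $\Delta \ge 0$ is selected so that the mode is attenuated rather than amplified by the mirrors. The function names are hypothetical, and the ratio $\beta/M$ is passed as a single dimensionless argument.

```python
import cmath
import math


def delta_lambda(g1, g2, beta_over_m):
    """Return (Delta, Lambda) satisfying equations (12) and (13).

    Uses sinh(Delta + j*Lambda) = 2*g2*beta/M + j*(2*g1*g2 - 1) and keeps
    the branch with Delta >= 0 (an assumption of this sketch).
    """
    z = complex(2.0 * g2 * beta_over_m, 2.0 * g1 * g2 - 1.0)
    w = cmath.asinh(z)
    if w.real < 0.0:
        w = 1j * math.pi - w      # sinh(j*pi - w) equals sinh(w)
    return w.real, w.imag


def g1_limits(g2, big_delta):
    """Second limit of the stability region, equation (20)."""
    ch = math.cosh(big_delta)
    return (1.0 - ch) / (2.0 * g2), (1.0 + ch) / (2.0 * g2)


# Fabry-Perot resonator (g1 = g2 = 1) with a weak gaussian taper.
fp_delta, fp_lambda = delta_lambda(1.0, 1.0, 0.1)
# A resonator with negative g2, to exercise the branch selection.
neg_delta, neg_lambda = delta_lambda(1.0, -1.0, 0.1)
# Uniform mirrors (beta = 0) inside the usual stability region.
urm_delta, urm_lambda = delta_lambda(0.5, 0.5, 0.0)

print(fp_delta, fp_lambda, math.cos(fp_lambda) != 0.0)
print(g1_limits(0.5, urm_delta))
```

For uniform mirrors the sketch returns $\Delta = 0$, and equation (20) then gives the limits $g_1g_2 = 0$ and $g_1g_2 = 1$; for the Fabry-Perot case any nonzero taper gives $\cos\Lambda \ne 0$, which is the statement of case (iv).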


2.3 The Threshold-Gain Ratio 


To sustain oscillations in a laser resonator a necessary condition is
that the active medium should have enough gain such that after a
double transit the field has the same amplitude. This condition can be
written in terms of the eigenvalues of the resonator modes as

$$G\,|K_\ell|^2 = 1, \tag{21}$$

where G is the power gain per double transit.


In particular for the TEM$_{0,0}$ and TEM$_{1,0}$ modes, equation (21) can be
written using equation (18) as:

$$G_{0,0}\,\Gamma_0^2\exp(-2\Delta) = 1, \tag{22}$$
$$G_{1,0}\,\Gamma_0^2\exp(-4\Delta) = 1, \tag{23}$$

where $G_{0,0}$ and $G_{1,0}$ are the threshold-power gains required for
oscillation in the respective modes. A quantity of interest is the
threshold-gain ratio, $\varepsilon$, defined by:

$$\varepsilon = \frac{G_{1,0}}{G_{0,0}} = \exp(2\Delta). \tag{24}$$

This ratio is a measure of the gain tolerance required for oscillation
in the dominant TEM$_{0,0}$ mode, and is independent of $\Gamma_0$.

[Figure 2: stable and unstable regions plotted against the mirror curvature parameters, with abscissa $g_1 = (1 - d/R_1)$ running from −2.0 to 2.0, for several values of the parameter $e^{2\Delta} = G_{1,0}/G_{0,0}$, where $G_{1,0}$ and $G_{0,0}$ are the threshold gains for the TEM$_{1,0}$ and TEM$_{0,0}$ modes; M1 is the VRM and M2 the URM.]

Fig. 2—Stability diagrams.

The threshold-gain ratio may also be expressed in terms of the loss 
per round trip Lo,o for the TEMo ,o mode. Since 


Lo.o = 1 — i exp (—24). (25) 
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The threshold-gain ratio can be written, 


ro 


es 1 SS Lee 


(26) 

Equation (26) is shown in Fig. 3 as a function of $L_{0,0}$. It is evident
that the threshold-gain ratio increases with the loss, and hence better
mode discrimination is obtained as the loss increases. Furthermore, the
power output is related to the power loss per transit. Therefore, different
values of $\Gamma_0$ can be used to shape the spatial distribution of the power
output.
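As a numerical illustration of equations (22) through (26), the following Python sketch evaluates the two threshold gains and the threshold-gain ratio in both forms. The function names and the sample values of $\Gamma_0$ and A are illustrative assumptions, not part of the original analysis.

```python
import math

def threshold_gain_00(gamma0, A):
    """Threshold power gain for TEM0,0 from G00 * Gamma0^2 * exp(-2A) = 1 (eq. 22)."""
    return math.exp(2.0 * A) / gamma0 ** 2

def threshold_gain_10(gamma0, A):
    """Threshold power gain for TEM1,0 from G10 * Gamma0^2 * exp(-4A) = 1 (eq. 23)."""
    return math.exp(4.0 * A) / gamma0 ** 2

def loss_00(gamma0, A):
    """Round-trip loss of TEM0,0: L00 = 1 - Gamma0^2 * exp(-2A) (eq. 25)."""
    return 1.0 - gamma0 ** 2 * math.exp(-2.0 * A)

def ratio_from_loss(loss, gamma0):
    """Threshold-gain ratio t = Gamma0^2 / (1 - L00) (eq. 26)."""
    return gamma0 ** 2 / (1.0 - loss)

if __name__ == "__main__":
    # Sample values (assumed): central reflectivity 0.9 and A chosen so that t = 1.2.
    gamma0, A = 0.9, 0.5 * math.log(1.2)
    t_direct = threshold_gain_10(gamma0, A) / threshold_gain_00(gamma0, A)
    t_from_loss = ratio_from_loss(loss_00(gamma0, A), gamma0)
    print(t_direct, t_from_loss)
```

Both routes give the same ratio, and the value does not change when `gamma0` is varied, in line with the remark that t is independent of $\Gamma_0$.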

A comparison is made (similar to that in Ref. 1) between the
threshold-gain ratio of a Fabry-Perot resonator with URM and that of a
resonator with VRM as a function of the loss per transit.

Based on the Vainshtein resonator theory for the URM resonator,
the threshold-gain ratio is

$$t = \left(\frac{1}{1 - L_d}\right)^{[(\nu_{1,0}/\nu_{0,0})^2 - 1]}, \qquad (27)$$

where $L_d$ is the diffraction loss for the TEM$_{0,0}$ mode and $\nu_{l,0}$ is the first
nonzero root of $J_l(\nu) = 0$.

Equation (27) is also plotted in Fig. 3. It is evident from this figure
that the threshold-gain ratio as a function of the diffraction loss is
higher for the Fabry-Perot resonator with uniform mirrors than the
corresponding ratio for the variable reflectivity resonator. This
conclusion was previously reached by Vlasov and Talanov, who also
showed that the highest threshold-gain ratio is obtained with confocal
resonators with uniform reflectivity mirrors. However, a high
threshold-gain ratio is of primary importance only for low-gain lasers,
where the output power obtained by partial transmittivity of the
mirrors is only a fraction of the power lost by diffraction. For
high-gain lasers, however, the mode utilization volume is of prime
importance and the threshold-gain ratio can be kept at a specified level by
the proper choice of the loss per pass. For such lasers the resonators
with variable reflectivity mirrors have the advantage that the power
loss which is necessary for mode discrimination can also be utilized as
the power output. The mode volume utilization aspect is discussed later.

Fig. 3. Comparison of uniform and variable reflectivity mirror resonators.
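The comparison plotted in Fig. 3 can be sketched numerically. The snippet below reads equation (27) as $t = [1/(1 - L_d)]^{(\nu_{1,0}/\nu_{0,0})^2 - 1}$, with $\nu_{0,0}$ and $\nu_{1,0}$ taken as the first nonzero roots of $J_0$ and $J_1$, and compares it with equation (26) for $\Gamma_0 = 1$. That reading of (27), the choice $\Gamma_0 = 1$, and the function names are assumptions made for illustration.

```python
import math

NU_00 = 2.404826  # first root of the Bessel function J0
NU_10 = 3.831706  # first nonzero root of the Bessel function J1

def ratio_urm(loss):
    """Threshold-gain ratio of the uniform-mirror Fabry-Perot resonator (eq. 27 as read here)."""
    exponent = (NU_10 / NU_00) ** 2 - 1.0
    return (1.0 / (1.0 - loss)) ** exponent

def ratio_vrm(loss, gamma0=1.0):
    """Threshold-gain ratio of the variable-reflectivity resonator (eq. 26)."""
    return gamma0 ** 2 / (1.0 - loss)

if __name__ == "__main__":
    for loss in (0.05, 0.1, 0.2, 0.4):
        print(loss, ratio_urm(loss), ratio_vrm(loss))
```

Under these assumptions the exponent in the uniform-mirror case exceeds one, so for any loss between 0 and 1 the uniform-mirror ratio lies above the variable-reflectivity ratio, which is the ordering described in the text.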


III. COMPUTED TEM$_{0,0}$ MODE CHARACTERISTICS


3.1 Spot Sizes 

The TEM$_{0,0}$ mode is of particular interest since it is the fundamental
mode, having the highest eigenvalue and hence the lowest loss.
For this mode the field distributions are gaussian with quadratic phase
variations. Specifically, the field distributions for the TEM$_{0,0}$ mode
from equations (7) and (8) are

$$F_{i1} = \exp\left\{-\frac{M}{4g_2}\left[\exp(-A)\cos\Delta + j\sinh A\,\sin\Delta\right]\rho_1^2\right\}, \qquad (28)$$

$$F_{r1} = \exp\left\{-\frac{M}{4g_2}\left[\exp(A)\cos\Delta + j\sinh A\,\sin\Delta\right]\rho_1^2\right\}, \qquad (29)$$

$$F_{r2} = \exp\left\{-\frac{M g_2}{2}\left(\frac{\cos\Delta + j\sinh A}{\sin\Delta + \cosh A}\right)\rho_2^2\right\}, \qquad (30)$$

where $F_{i1}$ and $F_{r1}$ are the field distributions of the incident and reflected
fields at M1, and $F_{r2}$ is the reflected field at M2.

The reflection coefficient can be expressed in terms of A and $\Delta$
by using equation (12) as

$$\Gamma = \Gamma_0 \exp\left(-\frac{M}{2g_2}\sinh A\,\cos\Delta\,\rho_1^2\right). \qquad (31)$$

The eigenvalue for the TEM$_{0,0}$ mode is

$$\kappa_{0,0} = \Gamma_0 \exp(-j2kd)\exp[-(A + j\psi_0)] \qquad (32)$$


with 


Y= a cos A. (33) 
The amplitudes of the field distributions at the mirrors are completely
characterized by the spot sizes, defined as the radius at which the
above quantities assume the value 1/e. Since the exponents in equations
(28) through (31) are proportional to M, it is convenient to introduce
the Fresnel numbers of the spot sizes. For example, for $F_{i1}$ the
spot size $\rho_{i1}$ is defined by

$$\frac{M}{4g_2}\exp(-A)\cos\Delta\,\rho_{i1}^2 = 1 \qquad (34)$$

or

$$N_{i1} = \frac{2g_2\exp(A)}{\pi\cos\Delta}, \qquad (35)$$

where $N_{i1} = \rho_{i1}^2/\lambda d$ is the Fresnel number. The corresponding Fresnel
numbers of equations (29) through (31), $N_{r1}$, $N_{r2}$, and $N_m$, are defined
in an analogous manner.

The above Fresnel numbers have been computed as a function of the
threshold-gain ratio t = exp(2A), with the radii of curvature as
parameters. Two types of resonators were considered: (i) a resonator with
a uniformly reflecting plane mirror and a curved mirror with variable
reflectivity with radius of curvature as a parameter, and (ii) a
resonator with a plane mirror with variable reflectivity and a uniformly
reflecting mirror with radius of curvature as a parameter.

For resonator (i), Figs. 4 through 6 show the Fresnel numbers of
the incident beam at M1, $N_{i1}$, the reflected beam at M2, $N_{r2}$, and the
Fresnel number of the variable reflectivity mirror, $N_m$. Figure 7 shows
the phase of the eigenvalue, $\psi_0$, equation (33). Figures 8 through 10
show the corresponding quantities for the type (ii) resonator. The
phase $\psi_0$ is the same as in Fig. 7 but with $g_1$ and $g_2$ interchanged.

Fig. 4. Spot size of incident beam, $N_{i1}$.
Fig. 5. Spot size of reflected beam, $N_{r2}$.
Fig. 6. Spot size of variable reflectivity mirror, $N_m$.

A comparison of the characteristics for the two types of resonators 
shows that the most pronounced differences are when either of the 
mirrors have curvatures g; or gz = 0.5. Larger spot sizes are obtain- 
able with the type (7) resonator. The Fabry-Perot resonator for which 
$g_1$ and $g_2$ = 1.0 is a special case for both types. It may also be noted
that a large increase in the spot size occurs when one of the mirrors is
slightly convex, e.g., $g_1$ or $g_2$ = 1.01. This increase is caused by the
curvature of the resonator mirror, which approaches the unstable
region (Fig. 2).

For a finite resonator, the resonator diameter will be limited by the
minimum obtainable reflectivity at the mirror edges. At the spot-size
diameter the reflection coefficient of the mirror has the value 1/e. The
spot-size diameter may be considered as a measure of the diameter of
the VRM. The Fresnel number of the spot size of the incident beam,
which for a variety of geometries is the maximum spot size of the
beam along the resonator, is related to the Fresnel number of the
mirror spot size by:

$$\frac{N_{i1}}{N_m} = \exp(2A) - 1 = t - 1. \qquad (36)$$

Fig. 7. Phase of the eigenvalue, $\kappa_{0,0}$.
Fig. 8. Spot size of incident beam, $N_{i1}$.
Fig. 9. Spot size of reflected beam, $N_{r2}$.
Fig. 10. Spot size of variable reflectivity mirror, $N_m$.


For a resonator with finite dimensions to be a good approximation to
the infinite resonator, it is necessary that the beam power outside the
mirrors be small. To obtain an estimate of this power, a resonator is
assumed with a diameter equal to the mirror spot-size diameter.

The ratio p of the incident power outside the mirror spot-size
diameter to the total incident power is, from equations (28) and (1),
given by

$$p = \frac{\int_{\rho_m}^{\infty} \exp\left[-\frac{M}{2g_2}\exp(-A)\cos\Delta\,\rho_1^2\right]\rho_1\,d\rho_1}{\int_{0}^{\infty} \exp\left[-\frac{M}{2g_2}\exp(-A)\cos\Delta\,\rho_1^2\right]\rho_1\,d\rho_1}, \qquad (37)$$

where $\rho_m$ is the mirror spot-size radius. After performing the
integration and substituting equation (12), this ratio can be written as

$$p = \exp\{-2/[\exp(2A) - 1]\}. \qquad (38)$$

It readily follows from equation (38) that for a threshold-gain ratio
t = exp(2A) smaller than 1.43, p is less than one percent.
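Equation (38) is simple enough to evaluate and invert directly. The sketch below computes p for a given threshold-gain ratio and the largest ratio that keeps p under a chosen bound; the function names are illustrative assumptions.

```python
import math

def fraction_outside(t):
    """Fraction of incident power outside the mirror spot size, p = exp(-2/(t - 1)) (eq. 38)."""
    if t <= 1.0:
        raise ValueError("t = exp(2A) must exceed 1")
    return math.exp(-2.0 / (t - 1.0))

def largest_ratio_for(p_max):
    """Invert eq. (38): largest t for which p does not exceed p_max (0 < p_max < 1)."""
    return 1.0 + 2.0 / math.log(1.0 / p_max)

if __name__ == "__main__":
    print(largest_ratio_for(0.01))   # bound on t for p below one percent
    print(fraction_outside(1.2))     # p for the ratio used later in the design example
```

For a one-percent bound the inversion gives t of about 1.43, since 2/ln(100) is roughly 0.434.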

A comparison is made between the characteristics of Fabry-Perot
resonators with one large uniformly reflecting mirror (large enough
that the diffraction loss is negligible) and with the other mirror
being either of uniform or variable reflectivity. For the resonator
with VRM, the Fresnel number for a given diffraction loss is twice that
obtained if both mirrors are of the same size. Figure 11 shows the Fresnel
number of the uniform mirror as a function of the threshold-gain ratio.
The curve is based on the Vainshtein resonator theory. In the same
figure is also shown the Fresnel number of the spot size of the incident
field, $N_{i1}$. It is evident from this figure that the spot-size diameter at
the VRM is considerably larger than the diameter of the uniform
reflectivity mirror for the same values of t. As an example, the special
case of a resonator with a uniform mirror with a Fresnel number of
two is considered. For this resonator, the field at the mirror has been
computed by T. Li. Though the field distribution for the TEM$_{0,0}$ mode is
not gaussian, for comparison purposes the Fresnel number of the spot
size (based on the 1/e value for the field) is estimated to be about
1.6. For the same value of t the Fresnel number of the spot size for the
VRM resonator is 5.2. For lower values of t the difference in the spot
sizes is even more pronounced. One of the advantages of the VRM
resonator is therefore the larger spot sizes and hence the large potential
for mode volume utilization.

Fig. 11. Comparison of the beam sizes of Fabry-Perot resonators.


3.2 Field Distributions in the Resonator 

For efficient mode volume utilization, the field along the resonator 
should be reasonably uniform. The uniformity of the fields is strongly 
dependent on the mirror curvatures. Referring to Fig. 1, let B1 be the 
reflected beam from M1 and B2 the reflected beam from M2. The 
functional dependence of the two beams on the longitudinal z coordi- 
nate has been obtained from the fields at the mirrors, and is given by 
the following equations 

















Bl = exp[—1.@)] 5, (39) 
B2 = exp [—v2@)] 5 (40) 
with 
d\’ .d gt oAka 
204(2) exp (A) cos A + i; (exp (A) + asin A) 
+ a’ cos’ A — 2g, “ (exp (A) sin A + » |} 
n@) = ————~Texp (a) Fasin dy +aresta AY) 
a=I1+ 29(4 as 1) ; (42) 
and 
M |2 (4) exp (A) cos A + (s)he (A)b — sin A)’ 
Ua\ 7 ss p J a2 p 
+ cos? A + 7 g : [exp (2A)b + 2 exp (A) sin A(g, — 1) — u} | 
Y2(z) = d d 2 
| exp (ay(o + a -) + (4 = 1) sin a 
+ (4. —_ 1) cos’ A 


(43) 
with

$$b = (2g_2 - 1). \qquad (44)$$

The real parts of equations (41) and (43) have been evaluated as a
function of z, in terms of the spot-size Fresnel numbers $N_1(z)$ and
$N_2(z)$, defined in accordance with equation (34) as
$N_{1,2}(z) = 2/[\gamma_{1,2}(z)\lambda d]$. For a resonator with $g_2 = 1.0$, Fig. 12 shows
the spot-size Fresnel number as a function of z/d for a number of
parameters.

It is characteristic of resonators with VRM that the minimum beam
spot size, even for symmetrical resonators, does not occur at half the
mirror spacing, in contrast to resonators with uniform reflectivity
mirrors. In Fig. 12 this characteristic is particularly evident for the
equivalent confocal resonator, $g_1 = 0.5$.

The uniformity of the beams along the resonator increases with
increasing value of $g_1$ up to $g_1 = 1.0$, which corresponds to the
Fabry-Perot resonator. For this resonator the beams are most uniform. For
the higher value $g_1 = 1.01$ the uniformity of the beams within
the resonator decreases.

Fig. 12. Spot size of the beams along the resonator.


3.3 Minimum Spot Sizes 
The uniformity of the beams within the resonator depends on the
locations of the minimum spot-size diameters. The beam diameters are
more uniform if the virtual minimum spot-size diameters occur at
distances from the mirrors that are large in comparison to
the separation of the mirrors. The positions of the minimum spot sizes
are obtained either by determining the maxima of the real parts or by
setting the imaginary parts of equations (41) and (43) equal to zero.
Either condition gives the same result (i.e., at the minimum spot-size
positions the beams have constant phase).

The minimum positions for the two beams, $(z_1/d)_{\min}$ and $(z_2/d)_{\min}$,
are:

$$\left(\frac{z_1}{d}\right)_{\min} = \frac{2g_2\,[2g_2 - 1 - \exp(A)\sin\Delta]}{\exp(2A) + 2(1 - 2g_2)\exp(A)\sin\Delta + (1 - 2g_2)^2} \qquad (45)$$

and

$$\left(\frac{z_2}{d}\right)_{\min} = \frac{2g_2\exp(A)\,[\exp(A)(2g_2 - 1) - \sin\Delta]}{\exp(2A)(1 - 2g_2)^2 + 2(1 - 2g_2)\exp(A)\sin\Delta + 1}. \qquad (46)$$

The Fresnel numbers of the minimum spot sizes, $[N_1(z)]_{\min}$ and
$[N_2(z)]_{\min}$, are:

$$[N_1(z)]_{\min} = \frac{2g_2\cos\Delta\,\exp(A)}{\pi[\exp(2A) + 2(1 - 2g_2)\exp(A)\sin\Delta + (1 - 2g_2)^2]} \qquad (47)$$

and

$$[N_2(z)]_{\min} = \frac{2g_2\cos\Delta\,\exp(A)}{\pi[\exp(2A)(1 - 2g_2)^2 + 2(1 - 2g_2)\exp(A)\sin\Delta + 1]}. \qquad (48)$$
The minimum positions and minimum spot sizes have been computed
for a resonator with a VRM of variable radius of curvature and a
uniform plane mirror, $g_2 = 1.0$. Since this resonator is equivalent to a
symmetrical resonator with two VRM, the two minimum positions
are mirror images with respect to the uniform mirror, and the minimum
spot sizes are the same. For this resonator

$$[N_1(z)]_{\min} = [N_2(z)]_{\min} = \frac{\cos\Delta}{\pi(\cosh A - \sin\Delta)}. \qquad (49)$$

For specified A, equation (49) has a maximum for $\cosh A \sin\Delta = 1$,
which from equation (13) corresponds to the Fabry-Perot resonator
($g_1 = g_2 = 1.0$). The corresponding Fresnel number $N_M$ is:

$$N_M = \frac{1}{\pi \sinh A}. \qquad (50)$$

Figure 13 shows $(z_2/d)_{\min}$ as a function of $g_1$ with t = exp(2A) as
a parameter. As $g_1$ increases, so does $(z_2/d)_{\min}$, assuming relatively
large values in the vicinity of $g_1 = 1.0$. The large values of $(z_2/d)_{\min}$
explain the uniformity of the beams in the Fabry-Perot resonator.

Figure 14 shows the dependence of the minimum spot size on the mirror
curvature $g_1$ with t as a parameter.
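The maximum quoted for equation (49) can be checked numerically. The sketch below takes equation (49) in the form $\cos\Delta/[\pi(\cosh A - \sin\Delta)]$ for $g_2 = 1$ and equation (50) as $1/(\pi\sinh A)$; these readings of the equations and the function names are assumptions made for illustration.

```python
import math

def min_spot_fresnel(A, delta):
    """Minimum spot-size Fresnel number for g2 = 1 (eq. 49 as read here)."""
    return math.cos(delta) / (math.pi * (math.cosh(A) - math.sin(delta)))

def fabry_perot_delta(A):
    """Value of delta satisfying cosh(A) * sin(delta) = 1 (Fabry-Perot case)."""
    return math.asin(1.0 / math.cosh(A))

def fabry_perot_min_spot(A):
    """Fresnel number of the minimum spot size for the Fabry-Perot resonator (eq. 50)."""
    return 1.0 / (math.pi * math.sinh(A))

if __name__ == "__main__":
    A = 0.5 * math.log(1.2)            # corresponds to t = exp(2A) = 1.2
    grid = [0.01 * k for k in range(1, 157)]
    best = max(grid, key=lambda d: min_spot_fresnel(A, d))
    print(best, fabry_perot_delta(A))  # the grid maximum sits next to the Fabry-Perot value
    print(min_spot_fresnel(A, fabry_perot_delta(A)), fabry_perot_min_spot(A))
```

A coarse scan over $\Delta$ is used instead of a closed-form derivative so that the location of the maximum is confirmed independently of the algebra.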


3.4 Dependence of the Spot Sizes on the Curvatures of Spherical Mirrors 
The previous calculations show that large spot sizes are obtainable
in the vicinity of the instability region. How critically the spot sizes
depend on the mirror curvatures is therefore of importance.
The Fresnel number of the spot size for the incident beam, $N_{i1}$,
is directly related to the Fresnel number of the VRM by equation (36).

Fig. 13. Location of the minimum spot size.

Equations (12) and (13) give the relation between the Fresnel number
of the VRM, the mirror curvatures, and A. Solving these equations for $g_1$
gives:

$$g_1 = \frac{\coth A\,[(\pi N_m \sinh A)^2 - g_2^2]^{1/2} + \pi N_m}{2\pi g_2 N_m}, \qquad (51)$$


where N,, is related to 8 in equation (1) by N,, = (1/BaAd). 

Equation (51) has been computed as a function of $g_2$ with t as a
parameter. Figures 15 through 18 show the computed characteristics
for $N_{i1}$ = 10, 20, 40, 100.
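Equation (51) can be evaluated directly once $N_m$, $g_2$, and t are chosen. The sketch below reads equation (51) as $g_1 = \{\coth A\,[(\pi N_m \sinh A)^2 - g_2^2]^{1/2} + \pi N_m\}/(2\pi g_2 N_m)$ with $A = \tfrac{1}{2}\ln t$; this reading and the function name are assumptions made for illustration.

```python
import math

def g1_from_mirror_fresnel(n_m, g2, t):
    """Curvature parameter g1 from the VRM Fresnel number N_m, g2 and t = exp(2A) (eq. 51 as read here)."""
    A = 0.5 * math.log(t)
    x = math.pi * n_m * math.sinh(A)
    radicand = x * x - g2 * g2
    if radicand < 0.0:
        raise ValueError("no real solution: (pi * N_m * sinh A)^2 must be at least g2^2")
    coth_A = math.cosh(A) / math.sinh(A)
    return (coth_A * math.sqrt(radicand) + math.pi * n_m) / (2.0 * math.pi * g2 * n_m)

if __name__ == "__main__":
    # With g2 = 1, t = 1.2 and the VRM Fresnel number quoted for the design example,
    # the result should fall very close to the Fabry-Perot value g1 = 1.
    print(g1_from_mirror_fresnel(38.35, 1.0, 1.2))
    # Two nearby threshold-gain ratios at fixed N_m, for comparison.
    print(g1_from_mirror_fresnel(38.35, 1.0, 1.15), g1_from_mirror_fresnel(38.35, 1.0, 1.25))
```

Evaluating the function over a range of t at fixed $N_m$ gives the kind of curves described for Figs. 15 through 18, although the exact plotting parameters of those figures are not reproduced here.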

The critical dependence of the beam spot size $N_{i1}$ on the mirror
curvatures is evident from these figures, particularly as $N_{i1}$
increases. A small change in $g_1$ or $g_2$ results in a large change in
t, and there is a very large change in the beam spot size $N_{i1}$ for a
specified $N_m$.

The conclusion based on these computations is that though large 
Fresnel numbers for the beam spot sizes are in principle possible, the 
critical tolerance requirements for the mirror curvatures may set a 
practical limit on spot sizes relative to those obtainable with a Fabry- 
Perot resonator with one VRM. 


(51) 
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3.5 Resonator Design 


As an application, a resonator design is considered for a CO$_2$ laser.
In view of the realizable high gain per pass, the power loss per pass
and the related power output should be large. A Fabry-Perot resonator
with a VRM seems to be most suitable for this application. The
remaining parameter to be specified is the threshold-gain ratio, t.
This ratio should be as low as possible in order to obtain large beam
diameters (see Fig. 11). For fundamental TEM$_{0,0}$ mode operation,
the limitation on t is based on the accuracy with which the gain can
be controlled. A value of t of 1.2 is assumed.

Using the above parameters in equations (28) and (31) (Figs. 4
and 6), the Fresnel number of the spot size of the VRM is 38.35 and
that of the incident beam at the VRM is 7.65. For a resonator with
length d = 100 cm and for a wavelength $\lambda = 10^{-3}$ cm (10 microns),
the radius of the spot size of the VRM is $a_m$ = 1.95 cm.
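The design figures above can be reproduced with a few lines of arithmetic. The sketch assumes the Fabry-Perot condition $\cosh A \sin\Delta = 1$ (so that $\cos\Delta = \tanh A$) and takes $N_m = 1/(\pi \sinh A \cos\Delta)$ for $g_2 = 1$, an expression inferred here from equations (31) and (51) and not quoted from the text; the function name is likewise illustrative.

```python
import math

def fabry_perot_vrm_design(t, length_cm, wavelength_cm):
    """Return (N_m, N_i1, a_m) for a Fabry-Perot resonator with one VRM.

    t is the threshold-gain ratio exp(2A); a_m is the VRM spot-size radius in cm.
    """
    A = 0.5 * math.log(t)
    cos_delta = math.tanh(A)                           # from cosh(A) * sin(delta) = 1
    n_m = 1.0 / (math.pi * math.sinh(A) * cos_delta)   # assumed form for g2 = 1
    n_i1 = n_m * (t - 1.0)                             # eq. (36)
    a_m = math.sqrt(n_m * wavelength_cm * length_cm)   # from N_m = a_m^2 / (lambda d)
    return n_m, n_i1, a_m

if __name__ == "__main__":
    n_m, n_i1, a_m = fabry_perot_vrm_design(t=1.2, length_cm=100.0, wavelength_cm=1.0e-3)
    print(round(n_m, 2), round(n_i1, 2), round(a_m, 2))
```

With t = 1.2, d = 100 cm and a 10-micron wavelength, the computed values agree with the quoted 38.35, 7.65 and 1.95 cm to within rounding.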

Some of the characteristics of this resonator are shown in Fig. 19.
Illustrated is the dependence of the power-reflection coefficient $\Gamma$ as a
function of the normalized radius $\rho/a_m$. Also shown are the normalized
incident power density at the VRM and the power-loss density at the VRM
for different values of the reflectivity at the center, $\Gamma_0$, as functions of
$\rho/a_m$. The power-loss density is the laser-power output when the
absorptivity of the VRM is zero. Figure 20 shows the ratio of the power
loss to the incident power as a function of $\rho/a_m$ with $\Gamma_0$ as a parameter.

Fig. 16. Relation between mirror curvature parameters $g_1$ and $g_2$ and Fresnel
number $N_{i1}$ of spot size at M1.

Fig. 17. Relation between mirror curvature parameters $g_1$ and $g_2$ and Fresnel
number $N_{i1}$ of spot size at M1.

Fig. 18. Relation between mirror curvature parameters $g_1$ and $g_2$ and Fresnel
number $N_{i1}$ of spot size at M1.

The actual diameter of the VRM can presently be determined only
by assuming that a finite resonator will behave similarly to a resonator
with infinite mirrors when the beam power outside a certain diameter is
small. For the resonator considered, 0.5 percent of the incident beam
power is contained outside the mirror radius of 0.73 $a_m$ and one percent
outside the radius 0.68 $a_m$, which corresponds for the above value of $a_m$ to
1.41 cm and 1.31 cm radii. For a resonator with a VRM of radius $a_m$
the perturbation of the fields should therefore be very small.
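The quoted radii follow from extending equation (38) to an arbitrary radius. For a gaussian incident beam the fraction of power outside a radius $x\,a_m$ is $\exp[-2x^2/(t - 1)]$, which reduces to equation (38) at x = 1. This extension and the function names below are assumptions made for illustration.

```python
import math

def fraction_outside_radius(x, t):
    """Fraction of incident power outside radius x * a_m for threshold-gain ratio t."""
    return math.exp(-2.0 * x * x / (t - 1.0))

def radius_for_fraction(p, t):
    """Normalized radius x = radius / a_m outside which a fraction p of the power lies."""
    return math.sqrt(0.5 * (t - 1.0) * math.log(1.0 / p))

if __name__ == "__main__":
    t = 1.2
    print(fraction_outside_radius(0.73, t), fraction_outside_radius(0.68, t))
    print(radius_for_fraction(0.005, t), radius_for_fraction(0.01, t))
```

For t = 1.2 the two fractions come out close to 0.5 percent and one percent, consistent with the radii 0.73 $a_m$ and 0.68 $a_m$ given above.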


IV. CONCLUSIONS


The characteristics of optical resonators with gaussian radial
variations of the mirror reflectivities have been investigated. These variable
reflectivity mirror (VRM) resonators seem to be particularly suitable
for high-gain and high-power laser applications such as the 10.6 micron
CO$_2$ laser. For fundamental TEM$_{0,0}$ mode generation, these
resonators have the advantage in comparison to conventional resonators
that larger beam spot sizes are obtainable (with better mode-volume
utilization) and the power loss necessary for mode discrimination can
be utilized as the power output.

The factors limiting the spot size are the threshold-gain ratio and the
mirror-curvature tolerances.

The Fabry-Perot resonator with a VRM is stable and, furthermore,
the field distribution along the resonator is more uniform in diameter
relative to other resonator geometries. A specific design of such a
resonator with a threshold-gain ratio ($G_{1,0}/G_{0,0}$) of 1.2 shows that
a spot-size Fresnel number of 7.65 is obtainable, with a power loss (or
power output, depending on the absorptivity of the mirrors) as high as
40 percent of the incident power.

In this investigation it was assumed that the mirrors are infinite.
The results presented should be a good approximation to a finite
resonator when the beam power outside a certain circular region
has a negligible value (say less than one percent).

Fig. 19. Power distribution at the variable reflectivity mirror.
APPENDIX A


Integrals of Laguerre Functions


In order to determine the orthogonality and power relations for the
modes in resonators with variable reflectivity mirrors, the following
integral $I_{m,n}$ of a product of Laguerre functions is evaluated:

$$I_{m,n} = 2\int_0^\infty \exp(-s\rho^2)\,L_m^l(\alpha\rho^2)\,L_n^l(\beta\rho^2)\,\rho^{2l+1}\,d\rho
        = \int_0^\infty \exp(-st)\,L_m^l(\alpha t)\,L_n^l(\beta t)\,t^l\,dt. \qquad (52)$$

Fig. 20. Ratio of the loss to the incident beam power as a function of $\rho/a_m$.

For the special case m = n the integral is known. The integral
(52) is evaluated by considering this integral as a Laplace transform
of the product of two functions $f_1(t)$ and $f_2(t)$ and using the Faltung relation


[ew -sontono at = 54 [OR — ads 68) 


where F,(z) and F.(z) are the Laplace transforms of fi(t) and f(t), 
y is a constant with Re(s) > Re(y) > 0. 
Let 


A) = Thad) = Yo (-1'G") (ai (54) 


and 


f(t) = Ln(Bt)t'. (55) 

The Laplace transform of equations (54) and (55) are readily ob- 

tained. Furthermore with the transformation £ = 1/z together with 
equation (58), equation (52) reduces to: 


Fg = PAD 1' ZL § (1) [(@ — ae — tae. 
(56) 


In equation (56) the contour of integration encloses the point 
= 1/s. Equation (56) is therefore evaluated by determining the 
residue at 2 = 1/s, which yields 


Taw = GAD (2) Eta — pote — af — rare. OD 


Equation (56) can be expressed in terms of Jacobi polynomials’ by 
rotating and translating the coordinate system with the result that 


(n+ £)! 


lin = alee (s — a — £)"(s — 6)” "PR (a) (58) 
with 
_ s + 2a8 — s@ + 8) 
7 = s(s Sy 6) (59) 
and P‘:""" is a Jacobi ploynomial defined by *” 
SMa aad S ia = (Cl — 2)" + ay]. 60) 
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As an application let F4(p) and F{*(p) designate the reflected fields at 
the VRM given by equation (7) and the* indicate the complex con- 


jugate. The integrals which enter in the evaluation of the total reflected 
power due to several modes of the same index #, can be written as 


[ PSPs de = 2 ..), (61) 


where a, is given by equation (11). Using equations (7), (9), (11), (12) 
and (58), it follows that 


a p= page cont) Knee Lee ae (ee Fee 


n! M cos A 
sin a] i | nba ha r fontam 
{i Bae cos A exp (—jA)| Pr" "(n) (62) 
with 
-|1 42 sin h sin KA 63) 
oe cos’ A 


For stable resonators with VRM, equation (62) is not equal to zero for
$m \neq n$. Hence, the total reflected power is in general not equal to the
sum of the powers of the individual modes.

The reflected field F{?(p) for any particular mode is related to the 
incident field F‘’(p); by reflection coefficient (1). The evaluation of 
the corresponding integral (61) for the incident fields yields the same 
value for y. Furthermore, setting m = n, the following relation is obtained


By . = T? exp [—2A,(2n + £+D] =| Kz [. (64) 


The meaning of equation (64) is that the ratio of the reflected to
incident power for a particular mode is precisely equal to the squared
absolute value of the eigenvalue.
Projecting Filters for Recursive Prediction
of Discrete-Time Processes 


By ALLEN GERSHO and DAVID J. GOODMAN 
(Manuscript received June 19, 1970) 


We consider the design of time-invariant recursive filters of constrained
order for one-step prediction of discrete-time stationary processes. For
this purpose, we introduce the projecting-filter concept. An nth-order
projecting filter for a given process has the characterizing property that
with the process as input, the output at each instant is the optimal linear
combination of the n previous output and n latest input samples. This
definition implies that (i) the filter is stable, (ii) any n + 1 consecutive
samples of the prediction error sequence are mutually uncorrelated, (iii)
the mean-square prediction error is at least as low as that of the best
nth-order nonrecursive predictor, and (iv) if the spectral density of the
process is rational of order 2n or less, then the nth-order projecting
filter coincides with the optimal (unconstrained) linear predictor.

A design algorithm for nth-order projecting filters iteratively generates
successive sets of coefficients of a time-varying nth-order recursive filter
which asymptotically approaches the desired time-invariant filter. The
only input data needed for the algorithm are the autocovariance coefficients
of the process to be predicted. When the order of the filter is matched to
the order of the process, the time-varying filter is the same as the Kalman
predictor. The algorithm has yielded effective projecting filters for several
specific processes. Our results indicate that near optimal prediction may
often be obtained with filters of order lower than that of the optimal
unconstrained predictor.


I. INTRODUCTION 


Although the optimal linear predictor of a random process must
make use of the entire past of the process, any practical predictor can
store only a finite number of data. One way to design a finite storage
predictor is to determine the best linear combination of the n latest
sample values of the process. However, for many processes, a large
value of n is required to achieve a performance quality approaching
that of the unconstrained optimal linear predictor. An alternate
approach is to find the best recursive predictor constrained to operate
only on the n latest data samples and the n latest predictions. This
approach has the advantage of using condensed information from the
entire past of the process, with the consequence that optimal or near
optimal prediction can often be achieved with a relatively small
amount of storage.
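The first approach, the best linear combination of the n latest samples, needs only the autocovariance of the process. The sketch below uses the standard Levinson-Durbin recursion, which is a swapped-in textbook technique and not the design algorithm of this paper; the function names and the first-order moving-average example are illustrative assumptions.

```python
def ma_autocovariance(b, max_lag):
    """Autocovariance r[0..max_lag] of x_k = sum_j b[j] * e_{k-j}, with e white and unit variance."""
    return [float(sum(b[j] * b[j + lag] for j in range(len(b) - lag)))
            for lag in range(max_lag + 1)]

def best_nonrecursive_predictor(r, n):
    """Levinson-Durbin recursion for the best n-tap one-step predictor.

    Returns (a, mse), where the prediction of x_k is sum_i a[i] * x_{k-1-i}
    and mse is the resulting mean-square prediction error.
    """
    if n < 1 or len(r) < n + 1:
        raise ValueError("need autocovariances r[0..n] and n >= 1")
    a = []
    mse = r[0]
    for m in range(1, n + 1):
        # Part of r[m] not explained by the order-(m-1) predictor.
        acc = r[m] - sum(a[i] * r[m - 1 - i] for i in range(m - 1))
        k = acc / mse                                   # reflection coefficient
        a = [a[i] - k * a[m - 2 - i] for i in range(m - 1)] + [k]
        mse *= (1.0 - k * k)
    return a, mse

if __name__ == "__main__":
    # Hypothetical example: x_k = e_k + 0.5 e_{k-1}; the unconstrained optimum error is 1.
    r = ma_autocovariance([1.0, 0.5], 6)
    for order in range(1, 7):
        print(order, best_nonrecursive_predictor(r, order)[1])
```

For this example the error decreases toward the unconstrained optimum only gradually as the number of taps grows, which is the behavior that motivates the recursive alternative.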

The purpose of this paper is to introduce the projecting-filter ap- 
proach to recursive prediction and to present an algorithm for the 
design of projecting filters that has yielded effective low-order pre- 
dictors not otherwise attainable. So far, a complete theory of project- 
ing filters has not been established. We do not yet know how broad 
is the class of processes which possess projecting filters of a given 
order; nor have we determined the class of processes for which our 
design algorithm is effective. However, we can report very favorable 
experience in the design of projecting filters for a variety of specific 
processes. We have also established some important theoretical proper- 
ties of projecting filters. 


1.1 Optimal and Finite Memory Predictors 

In certain special cases the optimal (least mean-square error) un- 
constrained predictor is realizable with a finite-storage filter.t In 
particular, for an nth-order autoregressive, or wide-sense Markov, 
process the optimal unconstrained predictor is a finite-memory non- 
recursive filter operating only on the n latest data samples. More 
generally, the optimal unconstrained predictor of any stationary proc- 
ess whose spectral density is rational of order 2n may be implemented 
as an nth-order recursive filter. The characteristics of the optimal filter 
may be determined by applying the discrete-time form of Wiener’s 
spectral factorization technique. Even more generally, consider any 
nonstationary process which can be modeled as the response of an 
nth-order linear time-varying recursive filter to an uncorrelated noise 
input. The optimal unconstrained predictor is an nth-order time- 
varying recursive filter? which may be determined by use of the 
Kalman filtering equations,? or, more efficiently, by a generalization 
of the approach taken in Section VI of this paper. 

If a random process cannot be modeled as the response of an nth-order
recursive filter to an uncorrelated input, then the optimal
unconstrained one-step linear predictor cannot be realized by an nth-order
filter. Nevertheless, it is realistic to preselect the desired order,
n, of the predictor and to seek the best recursive filter of this order.
In this way the structure of the predictor is conveniently specified
for digital filter implementation while only the 2n parameter values
need be supplied according to the process to be predicted.
Unfortunately, with the least-mean-square error criterion, the
constrained-order prediction problem is a special case of the unsolved
problem of $L_2$ rational approximation on the unit circle. No analytical
solution is known and optimization search techniques are severely
hampered by the multimodal nature of the error surface.*


1.2 Projecting Filters 

In this paper we introduce the projecting filter principle of recursive
prediction. Although the projecting filter is not a solution of the $L_2$
rational approximation problem, it has the local optimality property
that at each step it forms the best linear combination of the available
data. The term "projection" alludes to the geometrical interpretation
of random variables as vectors in Hilbert space. Each prediction
error of the projecting filter is a vector orthogonal to the n most recent 
inputs and the n previous errors. Hence the projecting filter performs a 
partial whitening of the input process. In this sense it approximates 
the action of the optimum unconstrained predictor, the error of which 
is a white-noise process—the innovations process of the input. If the 
input can be represented as the response of an nth-order filter to white 
noise, the nth-order projecting filter is the optimum unconstrained pre- 
dictor. For any process, the mean-square error of a projecting filter is 
never greater than the mean-square error of the optimum nonrecursive 
filter of the same order. Projecting filters are stable. 
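The structure of an nth-order recursive predictor, and the statement that it becomes the optimum unconstrained predictor when the input is the response of an nth-order filter to white noise, can be illustrated with a first-order case. In the sketch below the coefficients are chosen analytically for a hypothetical process $x_k = \epsilon_k + c\,\epsilon_{k-1}$ with |c| < 1; they are not produced by the design algorithm of this paper, and the function name is an assumption.

```python
def recursive_predict(x, a, b):
    """One-step predictions from an nth-order recursive filter with zero initial state.

    pred[k] = sum_i a[i] * pred[k-1-i] + sum_i b[i] * x[k-1-i], for i = 0..n-1,
    so pred[k] uses only the n previous predictions and the n latest inputs.
    """
    n = len(a)
    if len(b) != n:
        raise ValueError("a and b must have the same length n")
    pred = []
    for k in range(len(x)):
        value = 0.0
        for i in range(n):
            if k - 1 - i >= 0:
                value += a[i] * pred[k - 1 - i] + b[i] * x[k - 1 - i]
        pred.append(value)
    return pred

if __name__ == "__main__":
    c = 0.5
    e = [0.3, -1.2, 0.7, 0.1, -0.4]                       # sample white-noise values
    x = [e[k] + (c * e[k - 1] if k > 0 else 0.0) for k in range(len(e))]
    pred = recursive_predict(x, a=[-c], b=[c])            # pred[k] = c * (x[k-1] - pred[k-1])
    errors = [x[k] - pred[k] for k in range(len(x))]
    print(errors)                                         # reproduces e: the error is white
```

Here a single stored prediction carries the information from the entire past, whereas a nonrecursive predictor for the same process needs many taps to approach the same error.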


1.3 An Example 


These properties of projecting filters are observed in the example of
the eighth-order process $\{x_k\}$ represented by


Uy, = €& = 0.8¢,-1 + 0.56,—2 + 0.25¢€,_; = 0.6€,—4 — 0.26,_5 
+ O.lez_. + 0.46€,_7 = 0.08¢,_s 


in which $\{\epsilon_k\}$ is a stationary white-noise process with zero mean and
unit variance. The power spectral density function of $\{x_k\}$ has zeros
at the 16 points in the z-plane indicated in Fig. 1. The eighth-order
projecting filter for $\{x_k\}$, which is the optimum unconstrained
predictor, has poles at the eight locations indicated in Fig. 1 that are
outside the unit circle. The pole positions of a seventh-order projecting
filter are shown in Fig. 2. There are poles extremely close to all of the
locations outside the unit circle indicated in Fig. 1, except the one
furthest from the origin. Figures 3, 4, and 5 indicate the pole locations
of the sixth-, third-, and first-order projecting filters, respectively.
The poles of these filters do not coincide with zeros of the power
spectral density function of $\{x_k\}$.

*The complexity of the error as a function of the parameters is evidenced
by the work of R. S. Phillips on the corresponding continuous-time problem.

Fig. 1. Locations of zeros of the spectral density function of an eighth-order
process. The eighth-order projecting filter has poles at the zero locations that are
outside the unit circle.

Figure 6 demonstrates the projecting-filter mean-square-error
performance for this process. Here the horizontal base line is the optimal
unconstrained prediction error. The white bars indicate errors of
optimal constrained-order nonrecursive predictors and the shaded bars are
the errors of the projecting filters. It is significant that the error of
the seventh-order projecting filter is extremely close to the optimum
linear-prediction error; the ratio of the two errors is approximately
$1 + 10^{-7}$. By using the projecting filter approach to prediction, we have
discovered a means of reducing predictor complexity with virtually no
loss in accuracy. In addition, Fig. 6 shows the error resulting from
low-order recursive filters and the advantages relative to nonrecursive
prediction.

Fig. 2. Pole locations of seventh-order projecting filter.

Fig. 3. Pole locations of sixth-order projecting filter.


1.4 Organization of the Paper 


The content of the paper falls into two categories. Some sections 
contain descriptive and analytic material relevant to predictors and 
projecting filters in general and other sections pertain to the 
particular design method that has been used in synthesizing the 
predictors described in Section 1.3. Sections II, III and IV are in the 
first category; they define the prediction problem and the 
projecting-filter principle and focus attention on the essential 
properties of unconstrained predictors and projecting filters. Section 
V introduces the design method, an iterative scheme based upon 
successive projections in Hilbert space. This technique leads to a 
time-varying filter that asymptotically tends towards the desired 
projecting filter. Section VI shows that when the order of the filter 
is matched to that of the process, the design algorithm converges and 
the projecting-filter approach results in an efficient analysis and 
design (equivalent to but simpler than the Kalman filtering equations) 
for the unconstrained optimum time-varying filter with a given initial 
state. Section VII presents a derivation of the design algorithm. 

Fig. 4—Pole locations of third-order projecting filter. 

Fig. 5—Pole location of first-order projecting filter. [Axes: Re(z), Im(z).] 

Fig. 6—Mean-square errors of projecting filters and optimal nonrecursive 
filters of orders 1 through 8. [Axes: mean-square error versus filter order; 
bars for the nonrecursive filter and the projecting filter; base line at the 
unconstrained (minimum linear) prediction error.] 


II. PROBLEM STATEMENT 


We consider a purely nondeterministic* stationary process {x_k} with 
known covariance function, r_i = E x_k x_{k+i}. We assume that the 
spectral density function of the process, f(z) = \sum_i r_i z^i, has no 
zeros on the unit circle, |z| = 1. The purpose of this paper is to 
describe a new approach to the design of a stable one-step predicting 
filter with the nth-order recursive structure 

    y_k = \sum_{i=0}^{n-1} a_i x_{k-i} + \sum_{i=1}^{n} b_i y_{k-i}.    (1) 

A natural measure of the performance of the predictor is the 
mean-square value of the prediction error 

    e_{k+1} = x_{k+1} - y_k.    (2) 

Because the determination of the optimum filter coefficients with 
respect to this criterion is an intractable problem of approximation 
theory, our design method is based on a different performance 
objective. Rather than synthesize the least-squares nth-order recursive 
filter, we seek a stable time-invariant filter with the following 

Projecting property: With input {x_k}, the output, y_k, is, at each 
instant k, the linear combination of the data {x_k, x_{k-1}, ..., 
x_{k-n+1}, y_{k-1}, ..., y_{k-n}} currently in the filter memory that 
predicts x_{k+1} with least mean-square error. 

This implies that the filter coefficients a_i and b_i satisfy a set of 
linear equations involving the covariance functions of {x_k} and {y_k}. 
The autocovariance of {x_k} corresponds to the given data of the 
prediction problem but the cross-covariance between {x_k} and {y_k} and 
the autocovariance of {y_k} are transcendental functions of a_i and 
b_i. It follows that an explicit solution for the coefficients from the 
constraints imposed by the projecting property is not possible. An 
algorithmic solution is presented in Section VII. 
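
The recursive structure of equation (1) and the error of equation (2) can be illustrated with a short sketch. It assumes a filter that starts with zero in every memory element at k = 0; the function names and the list layout of the coefficients (a[i] for a_i, b[i-1] for b_i) are illustrative choices, not notation from the paper.

```python
def recursive_predictor(x, a, b):
    """Run y_k = sum_{i=0}^{n-1} a[i] x[k-i] + sum_{i=1}^{n} b[i-1] y[k-i].

    Assumes zero initial state: samples before k = 0 are treated as zero.
    """
    n = len(a)
    y = []
    for k in range(len(x)):
        acc = 0.0
        for i in range(n):            # forward section, a_0 .. a_{n-1}
            if k - i >= 0:
                acc += a[i] * x[k - i]
        for i in range(1, n + 1):     # feedback section, b_1 .. b_n
            if k - i >= 0:
                acc += b[i - 1] * y[k - i]
        y.append(acc)
    return y


def prediction_errors(x, y):
    """e_{k+1} = x_{k+1} - y_k, returned for k = 0 .. len(x) - 2."""
    return [x[k + 1] - y[k] for k in range(len(x) - 1)]
```

For a first-order filter with a_0 = 1 and b_1 = 0.5, a unit impulse at the input produces the geometric impulse response 1, 0.5, 0.25, and so on.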


* See Ref. 1, p. 23. 
III. UNCONSTRAINED PREDICTION 


We refer to the problem defined in Section II as a constrained-order 
prediction problem because the order, n, of the predictor is 
prespecified. Another problem, which we refer to as unconstrained 
linear prediction, has received considerable attention in the 
literature of stochastic processes.’ The optimum unconstrained 
prediction, \hat{x}_{k+1}, of x_{k+1} is the least mean-square linear 
combination of the entire past, x_k, x_{k-1}, ..., of {x_k}. In the 
terminology of the Hilbert space description of random variables, 
\hat{x}_{k+1} is called the projection of x_{k+1} into the past of 
{x_k}, and we thus adopt the following convenient notation: 

    \hat{x}_{k+1} = P\{x_{k+1} \mid x_k, x_{k-1}, \ldots\}.    (3) 

When {x_k} is gaussian, the projection coincides with the conditional 
expectation. 


3.1 The Error Process 


The error process {\nu_k}, defined by 

    \nu_{k+1} = x_{k+1} - \hat{x}_{k+1},    (4) 

is the innovations process of {x_k}. It has the key orthogonality 
properties: 

    E \nu_{k+1} x_{k-i} = 0,      i = 0, 1, 2, \ldots;    (5) 
    E \nu_{k+1} \nu_{k-i} = 0,    i = 0, 1, 2, \ldots.    (6) 

Equation (5), which characterizes the projection operation, indicates 
that the best linear predictor cannot make better use of the past of 
{x_k}. Equation (6), a direct consequence of equation (5), shows that 
the error process is white noise. 


3.2 Stability 


The optimal unconstrained prediction, \hat{x}_{k+1}, may be 
characterized as the limit of an infinite sequence of constrained-order 
nonrecursive predictions: 

    \hat{x}_{k+1} = \lim_{n \to \infty} \sum_{i=0}^{n-1} h_{in} x_{k-i},    (7) 

where h_{in} (i = 0, 1, ..., n - 1) are the coefficients of the optimum 
nth-order nonrecursive predictor, which may be calculated by means of 
well-known quadratic minimization techniques. The unconstrained 
predictor is a stable function of the data in the sense that 

    \lim_{n \to \infty} \sum_{i=0}^{n-1} h_{in}^2 < \infty.    (8) 

This is proved in Section IV. 
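
As an illustration of how the coefficients h_{in} of equation (7) might be obtained from the covariance function, the sketch below uses the Levinson-Durbin recursion. This is a standard technique chosen here for compactness; the paper does not say which quadratic minimization method the author had in mind, and the function name and argument layout are assumptions.

```python
def levinson_durbin(r, n):
    """Optimum nth-order nonrecursive one-step predictor.

    r[m] = E x_k x_{k+m} for m = 0 .. n.
    Returns (h, err): h[i] multiplies x_{k-i} in the prediction of x_{k+1},
    and err is the resulting mean-square prediction error.
    """
    h = []
    err = r[0]
    for m in range(1, n + 1):
        # Reflection coefficient obtained when the order grows from m-1 to m.
        acc = r[m] - sum(h[i] * r[m - 1 - i] for i in range(m - 1))
        kappa = acc / err
        # Update the lower-order coefficients, then append the new one.
        h = [h[i] - kappa * h[m - 2 - i] for i in range(m - 1)] + [kappa]
        err *= 1.0 - kappa * kappa
    return h, err
```

For a first-order autoregression with r_m = 0.8^m, every order of predictor reduces to h = (0.8, 0, ..., 0), which is consistent with the remark in Section 4.1 about autoregressive processes.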


3.3 Process Representation 


We say {x_k} is of nth order if it can be represented as the response 
of a stable recursive nth-order filter to white noise so that 

    \sum_{i=0}^{n} \alpha_i x_{k-i} = \sum_{i=0}^{n} \beta_i \epsilon_{k-i},    (9) 

in which \alpha_n or \beta_n is nonzero, {\epsilon_k} is a white-noise 
process, and \sum_i \alpha_i z^i has no zeros in |z| ≤ 1. If {x_k} is of 
order n, it is known that there exists an nth-order recursive filter 
which generates {\hat{x}_k} in response to {x_k}. The error process of 
this filter is {\nu_k}, the innovations process of {x_k}. If 
\sum_i \beta_i z^i ≠ 0 for |z| ≤ 1, then \nu_k = \epsilon_k. 

Conversely, if {x_k} does not possess an nth-order representation of 
the form of equation (9), the best unconstrained predictor cannot be 
realized by an nth-order filter. To prove this we assume that such a 
realization does exist. That is, we assume 

    \hat{x}_{k+1} = \sum_{i=0}^{n-1} d_i x_{k-i} + \sum_{i=1}^{n} c_i \hat{x}_{k+1-i}.    (10) 

This combined with equation (4) implies 

    x_{k+1} - \sum_{i=0}^{n-1} (d_i + c_{i+1}) x_{k-i} 
        = \nu_{k+1} - \sum_{i=0}^{n-1} c_{i+1} \nu_{k-i},    (11) 

which shows that {x_k} is in fact the response of an nth-order filter to 
the white-noise process {\nu_k}, which is a contradiction. 
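
The step from equation (10) to equation (11) can be written out explicitly. The block below is a sketch of that substitution; it reads the scanned equation (10) as having forward coefficients d_i and feedback coefficients c_i, which is an assumption about the original typography.

```latex
% Substitute \hat{x}_j = x_j - \nu_j, from equation (4), into equation (10).
\begin{align*}
x_{k+1} - \nu_{k+1}
  &= \sum_{i=0}^{n-1} d_i\, x_{k-i}
     + \sum_{i=1}^{n} c_i \left( x_{k+1-i} - \nu_{k+1-i} \right) \\
  &= \sum_{i=0}^{n-1} (d_i + c_{i+1})\, x_{k-i}
     - \sum_{i=0}^{n-1} c_{i+1}\, \nu_{k-i} .
\end{align*}
% Collecting the x terms on the left and the \nu terms on the right:
\[
x_{k+1} - \sum_{i=0}^{n-1} (d_i + c_{i+1})\, x_{k-i}
  = \nu_{k+1} - \sum_{i=0}^{n-1} c_{i+1}\, \nu_{k-i} .
\]
```

The second line re-indexes the feedback sum with i replaced by i + 1, so both sums run over i = 0, ..., n - 1.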


IV. PROJECTING FILTERS 


4.1 Orthogonality Properties 

We have shown that an nth-order recursive filter cannot perform 
optimal unconstrained linear prediction of a process of order greater 
than n. With such a process as input, the error process {e_k} of an 
nth-order filter will necessarily have a higher mean-square value than 
that of the innovations process and {e_k} will fail to meet the 
orthogonality conditions of equations (5) and (6). However, when the 
nth-order predictor possesses the projecting property defined in 
Section II, its error process satisfies some but not all of the 
orthogonality conditions met by the innovations process. In particular, 
the projecting property requires that 

    y_k = P\{x_{k+1} \mid x_k, x_{k-1}, \ldots, x_{k-n+1}, y_{k-1}, \ldots, y_{k-n}\},    (12) 

which is characterized by the orthogonality conditions 

    E e_{k+1} x_{k-i} = 0,    i = 0, 1, \ldots, n-1;    (13) 
    E e_{k+1} e_{k-i} = 0,    i = 0, 1, \ldots, n-1.    (14) 

Note that in this case, equation (14) is not a direct consequence of 
equation (13). In fact equation (13) is satisfied by the error of the 
optimum nth-order nonrecursive filter, while equation (14) is not 
satisfied by this error unless {x_k} is an nth-order autoregression, 
that is, an nth-order process with \beta_i = 0 for i > 0. 


4.2 Stability 


Projecting filters are inherently stable. In fact, some kind of 
stability property is implicit in any statement of steady-state 
properties of a time-invariant filter. In this paper we say that a 
filter is stable if its impulse response is square summable, which 
implies, if the spectrum is rational, that the filter transfer function 
is analytic on and in the unit circle. We assume that the predicting 
filter has zero in each memory element prior to k = 0, at which time 
{x_k} is applied to the input. The projecting property stated in 
Section II implies that in the limit as k → ∞, y_k tends toward the 
projection indicated in equation (12). Thus in the limit, the 
orthogonality conditions of equations (13) and (14) are satisfied, from 
which it follows that E e_{k+1} y_k → 0 and, since 
y_k + e_{k+1} = x_{k+1}, 

    \lim_{k \to \infty} [E y_k^2 + E e_{k+1}^2] = E x_{k+1}^2 = r_0, 

from which we infer 

    \limsup_{k \to \infty} E y_k^2 \le r_0.    (15) 

We also know that the filter output for each k ≥ 0 is the finite sum 

    y_k = \sum_{i=0}^{k} g_i x_{k-i},    (16) 

in which g_i is the filter impulse response. Equations (15) and (16) 
imply the existence of a positive number c, which bounds the 
mean-square output: 

    E y_k^2 < c,    for all k.    (17) 

The existence of this bound leads to the following 


Theorem: If a filter with impulse response g_i is a projecting filter, 
it is stable in the sense that 

    \sum_{i=0}^{\infty} g_i^2 < \infty.    (18) 

Proof: In terms of f(z), the power spectral density function of {x_k}, 
and the frequency transfer function of the filter we have 

    E y_k^2 = \frac{1}{2\pi} \int_{-\pi}^{\pi} 
        \left| \sum_{m=0}^{k} g_m e^{im\omega} \right|^2 f(e^{i\omega}) \, d\omega 
        \ge \delta \sum_{m=0}^{k} g_m^2,    (19) 

in which \delta = \min_{|z|=1} f(z) > 0 according to the assumption 
stated in Section II. Equations (17) and (19) may be combined in the 
expression 

    \sum_{m=0}^{k} g_m^2 < c/\delta,    for all k,    (20) 

from which equation (18) follows. 

The same reasoning leads to a proof of the stability of the 
unconstrained predictor, with g_i replaced by h_{in}, the impulse 
response of the nonrecursive predictor described in Section 3.2. 


V. PROJECTING-FILTER DESIGN APPROACH 


As we stated in Section II, an attempt to determine the filter 
coefficients by directly combining equation (1) and equations (13) and 
(14) leads to an intractable set of transcendental equations relating 
the coefficients and the autocovariance function of {x_k}. On the other 
hand, the iterative approach introduced in this paper leads to the 
computation of the desired coefficients by means of standard 
operations of arithmetic and matrix algebra. 

Our design method results in a time-varying filter which, starting 
with zero in all memory elements, sequentially predicts x_1, x_2, ... 
according to the projecting principle. At each step the filter forms 
the optimum linear combination of the available data. 

Thus we define the process {x'_k} such that 

    x'_k = 0,      k < 0; 
    x'_k = x_k,    k \ge 0;    (21) 

and we adopt as our prediction of x_{k+1}, 

    y_k = 0,    k < 0; 
    y_k = P\{x_{k+1} \mid x'_k, x'_{k-1}, \ldots, x'_{k-n+1}, y_{k-1}, \ldots, y_{k-n}\},    k \ge 0.    (22) 

Equation (22) uniquely defines the time-varying linear transformation 
which generates the nonstationary process {y_k} from the stationary 
process {x_k}. 

At each step the prediction error of the time-varying filter meets 
the orthogonality conditions of equations (13) and (14) so that 
E e_{k+1} y_k = 0 and therefore E y_k^2 ≤ r_0 for all k. Following the 
proof of the theorem in Section 4.2 we can show that with the filter 
output represented by 

    y_k = \sum_{i=0}^{k} g_{ik} x_{k-i},    k \ge 0,    (23) 

the time-varying filter possesses the stability property 

    \limsup_{k \to \infty} \sum_{i=0}^{k} g_{ik}^2 < \infty.    (24) 

Furthermore, if this filter approaches the time-invariant projecting 
filter with impulse response g_i in the sense that 

    \lim_{k \to \infty} \sum_{i=0}^{k} (g_{ik} - g_i)^2 = 0,    (25) 

we are assured that this filter is stable and that it has the desired 
nth-order recursive structure. Hence if we determine, for each k, 
a_{ik} and b_{ik} such that 

    y_k = \sum_{i=0}^{n-1} a_{ik} x'_{k-i} + \sum_{i=1}^{n} b_{ik} y_{k-i}    (26) 

is equivalent to equation (22), then successive computation of these 
coefficients leads to the desired time-invariant projecting filter. 

Note that although y_k is uniquely determined by equation (22), 
the coefficients a_{ik} and b_{ik} in the representation of equation 
(26) are not unique when the set of stored data is linearly dependent. 
This situation is analyzed in Section 7.4. 
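
The successive projections of equation (22) can be imitated numerically without the covariance recursions of Section VII. The sketch below is an illustration under stated assumptions, not the design algorithm of the paper: every random variable is stored as a list of weights on x_0, x_1, ..., inner products are taken with the covariance function r(m), and the projection is computed by Gram-Schmidt orthogonalization, skipping linearly dependent variates. All function names are hypothetical.

```python
def inner(u, v, r):
    """E[(sum_i u[i] x_i)(sum_j v[j] x_j)] for weight vectors u and v."""
    return sum(u[i] * v[j] * r(abs(i - j))
               for i in range(len(u)) if u[i] != 0.0
               for j in range(len(v)) if v[j] != 0.0)


def project(target, data, r, tol=1e-12):
    """Weights of the projection of `target` into the span of `data`."""
    basis = []
    for d in data:
        q = list(d)
        for b, nb in basis:               # modified Gram-Schmidt step
            c = inner(q, b, r) / nb
            q = [qi - c * bi for qi, bi in zip(q, b)]
        nq = inner(q, q, r)
        if nq > tol:                      # drop linearly dependent variates
            basis.append((q, nq))
    out = [0.0] * len(target)
    for b, nb in basis:
        c = inner(target, b, r) / nb
        out = [oi + c * bi for oi, bi in zip(out, b)]
    return out


def unit(m, size):
    """Weight vector that represents the single sample x_m."""
    return [1.0 if i == m else 0.0 for i in range(size)]


def time_varying_projecting_filter(r, n, steps):
    """Return Y, where Y[k] holds the weights of y_k on x_0 .. x_steps."""
    size = steps + 1
    Y = []
    for k in range(steps):
        # Stored data at time k; samples before time 0 are zero and omitted.
        data = [unit(k - i, size) for i in range(n) if k - i >= 0]
        data += [Y[k - i] for i in range(1, n + 1) if k - i >= 0]
        Y.append(project(unit(k + 1, size), data, r))
    return Y
```

Because each y_k is kept as an explicit weight vector, Y[k] is the time-varying impulse response g_{ik} of equation (23) in reversed index order. The cost grows with the number of steps, so this form is only suitable for small experiments.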


VI. MATCHED-ORDER PROCESSES 


We prove in Section 6.1 that when {x_k} is of order n, the 
projecting-filter design technique results in least mean-square 
time-varying prediction in the sense that each output y_k is the 
optimum linear combination of the entire observed past of {x_k}. Thus 
y_k is equal to the output of the optimal nonrecursive filter of order 
k + 1 as described in Section 3.2 so that 

    \lim_{k \to \infty} (y_k - \hat{x}_{k+1}) = 0, 

indicating that the design algorithm converges to the optimal 
unconstrained predictor. In Section 6.2, we derive simple formulas for 
the filter coefficients generated by the design procedure. 


6.1 Optimality 


We denote by \mathcal{H}_k the subspace spanned by the random variables 
in the filter memory at time k: x'_k, x'_{k-1}, ..., x'_{k-n+1}, 
y_{k-1}, ..., y_{k-n}, and we denote by \mathcal{R}_k the subspace 
spanned by the observed past of {x_k}: x_k, x_{k-1}, ..., x_0. Note 
that another spanning set of \mathcal{R}_k is e_k, e_{k-1}, ..., e_1, 
x_0, where {e_k} is the error sequence of the projecting filter. This 
statement follows by induction since x_0 spans \mathcal{R}_0 and if 
{e_j, e_{j-1}, ..., e_1, x_0} spans \mathcal{R}_j then {e_{j+1}, e_j, 
..., e_1, x_0} spans \mathcal{R}_{j+1} because e_{j+1} = x_{j+1} - y_j 
with y_j in \mathcal{R}_j. 

In this section we assume that {x_k} is an nth-order process 
represented by equation (9) with \alpha_0 = 1 so that 

    x_{k+1} = u_{k+1} - \sum_{i=1}^{n} \alpha_i x_{k+1-i},    (27) 

in which {u_k} is the moving average process with 

    u_{j+1} = \sum_{i=0}^{n} \beta_i \epsilon_{j+1-i},    (28) 

and {\epsilon_k} is a unit-mean-square white-noise process. Because 
\sum_i \alpha_i z^i ≠ 0 for |z| ≤ 1, equation (27) may be expressed in 
the form 

    x_{k+1} = \sum_{i=0}^{\infty} h_i \epsilon_{k+1-i},    (29) 

in which {h_i} is square summable. Equation (29) shows that 

    E \epsilon_{k+1} x_{k-i} = 0,    i \ge 0,    (30) 

and equations (28) and (30) imply 

    E u_{k+1} x_{k-i} = 0,    i \ge n.    (31) 


If we let x^*_{k+1} denote the optimal "growing-memory" prediction of 
x_{k+1} with the projection characteristic 

    x^*_{k+1} = P\{x_{k+1} \mid \mathcal{R}_k\}, 

we have the following 

Theorem: At each instant k, the time-varying filter output defined by 

    y_k = P\{x_{k+1} \mid \mathcal{H}_k\} 

is the optimal growing-memory predictor in the sense that 

    y_k = x^*_{k+1}.    (32) 

Proof: We will show that x^*_{k+1} ∈ \mathcal{H}_k, which implies 
equation (32) because \mathcal{H}_k ⊂ \mathcal{R}_k. Clearly, for 
0 ≤ k < n, \mathcal{H}_k = \mathcal{R}_k so that y_k = x^*_{k+1}. We 
assume y_k = x^*_{k+1} for all k < j and show that this implies 
y_j = x^*_{j+1}. Hence, by induction, equation (32) is valid for all k. 

Let j ≥ n and assume equation (32) holds for all k < j. Then 

    E e_{k+1} x_{k-i} = 0,    for k = 0, 1, \ldots, j-1;  i = 0, 1, \ldots, k.    (33) 

This implies that the vectors e_j, e_{j-1}, ..., e_1, x_0, which span 
\mathcal{R}_j, are mutually orthogonal. Thus a projection into 
\mathcal{R}_j is the sum of the projections into each of these basis 
vectors. In particular 

    P\{u_{j+1} \mid \mathcal{R}_j\} = P\{u_{j+1} \mid x_0\} 
        + \sum_{i=0}^{j-1} P\{u_{j+1} \mid e_{j-i}\}.    (34) 

Now note that e_{j-i} ∈ \mathcal{R}_{j-i} and that equation (31) states 
that u_{j+1} ⊥ \mathcal{R}_{j-i} for i ≥ n. Thus the first term in 
equation (34) and all but the first n terms of the summation are zero 
so that 

    P\{u_{j+1} \mid \mathcal{R}_j\} = \sum_{i=0}^{n-1} P\{u_{j+1} \mid e_{j-i}\}.    (35) 

We now consider x^*_{j+1} by noting that the projection operator is 
linear and that P\{x_{k-i} \mid \mathcal{R}_k\} = x_{k-i} for 
i = 0, 1, ..., k. Thus equation (27) implies 

    x^*_{j+1} = P\{x_{j+1} \mid \mathcal{R}_j\} 
              = P\{u_{j+1} \mid \mathcal{R}_j\} - \sum_{i=1}^{n} \alpha_i x_{j+1-i},    (36) 

or, from equation (35), 

    x^*_{j+1} = \sum_{i=0}^{n-1} P\{u_{j+1} \mid e_{j-i}\} 
              - \sum_{i=1}^{n} \alpha_i x_{j+1-i}.    (37) 

Note that the ith term in the first summation is proportional to 
e_{j-i} so that x^*_{j+1} is a linear combination of x_j, x_{j-1}, ..., 
x_{j-n+1}, y_{j-1}, ..., y_{j-n}, the basis vectors of \mathcal{H}_j. 
Thus x^*_{j+1} ∈ \mathcal{H}_j and 

    x^*_{j+1} = P\{x_{j+1} \mid \mathcal{H}_j\} = y_j. 

Hence x^*_{k+1} = y_k for all k. Q.E.D. 


6.2 Filter Coefficients 


In this section we derive explicit recursions for the coefficients and 
mean-square error of the optimal growing-memory predictor of a 
stationary nth-order process. We begin with equation (37) for the 
optimal prediction and observe that the projections have the form 

    P\{u_{k+1} \mid e_{k-i}\} = \gamma_{ik} e_{k-i},    i = 0, 1, \ldots, n-1,    (38) 

where the coefficients are ratios of two expectations, 

    \gamma_{ik} = E u_{k+1} e_{k-i} / E e_{k-i}^2.    (39) 

These expectations may be expressed as functions of the autocovariance 
coefficients, 

    \sigma_i = E u_k u_{k+i},    (40) 

of the stationary moving average process {u_k}. 

Our derivation begins with the expression of the error at step k, 
e_{k+1} = x_{k+1} - x^*_{k+1}, as the difference between equation (27) 
for x_{k+1} and equation (37) for x^*_{k+1}: 

    e_{k+1} = u_{k+1} - \sum_{i=0}^{n-1} \gamma_{ik} e_{k-i}.    (41) 

Squaring equation (41) and taking the expectation we obtain 

    E e_{k+1}^2 = \sigma_0 - \sum_{i=0}^{n-1} \gamma_{ik}^2 E e_{k-i}^2,    (42) 

which gives the mean-square error at step k in terms of current filter 
coefficients and past errors. To find the next set of coefficients, 
\gamma_{i,k+1}, we express e_{k+1-i} as in equation (41) and we find 
the expected product of this random variable and u_{k+2}. Then we 
divide by the mean-square indicated in equation (39) with the result 

    \gamma_{n-1,k+1} = \sigma_n / E e_{k+2-n}^2, 

    \gamma_{i,k+1} = \left[ \sigma_{i+1} - \sum_{j=0}^{n-i-2} 
        \gamma_{j,k-i} \gamma_{i+j+1,k+1} E e_{k-i-j}^2 \right] / E e_{k+1-i}^2, 

        i = n-2, n-3, \ldots, 0,    (43) 

where the upper limit on the sum is a consequence of the property 
E u_{k+2} e_{k-i-j} = 0 for j ≥ n - i - 1. [See equation (31).] 

The filter coefficients a_{ik} and b_{ik} of equation (26) are related 
to the projection coefficients \gamma_{ik} and the autoregressive 
coefficients, \alpha_i, of the process representation by 

    a_{ik} = \gamma_{ik} - \alpha_{i+1}, 
    b_{ik} = -\gamma_{i-1,k},    (44) 

because equations (37) and (38) combine to form 

    x^*_{k+1} = \sum_{i=0}^{n-1} (\gamma_{ik} - \alpha_{i+1}) x_{k-i} 
              - \sum_{i=1}^{n} \gamma_{i-1,k} y_{k-i}.    (45) 

Our recursive technique for finding the characteristics of the optimal 
nth-order growing-memory predictor thus consists of alternately 
performing the calculations of equations (42) and (43) and of obtaining 
the filter coefficients at each step by means of equation (45). 
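
The alternation between equations (42) and (43) can be sketched in a few lines. To keep the start-up simple, the sketch assumes a pure moving-average process (all \alpha_i = 0 for i ≥ 1, so x_k = u_k), takes e_0 = x_0, and treats any error with a negative time index as absent. These start-up conventions, the function name, and the list layout are assumptions; the paper derives the recursions for steps k ≥ n.

```python
def ma_prediction_recursion(sigma, steps):
    """Iterate eqs. (42)-(43) for a moving-average process of order n.

    sigma[i] = E u_k u_{k+i} for i = 0 .. n.
    Returns (v, gammas) with v[k] = E e_k^2 and gammas[k][i] = gamma_{ik}.
    """
    n = len(sigma) - 1
    v = [sigma[0]]                    # e_0 = x_0 = u_0 (assumed start-up)
    gammas = []
    for k in range(steps):
        g = [0.0] * n
        for i in range(n - 1, -1, -1):
            if k - i < 0:             # e_{k-i} does not exist yet
                continue
            acc = sigma[i + 1]
            for j in range(n - i - 1):        # j = 0 .. n-i-2
                m = k - 1 - i - j
                if m < 0:
                    break
                acc -= gammas[k - 1 - i][j] * g[i + j + 1] * v[m]
            g[i] = acc / v[k - i]             # eq. (43), written at time k
        gammas.append(g)
        # Eq. (42): mean-square error of the next prediction.
        v.append(sigma[0] - sum(g[i] ** 2 * v[k - i]
                                for i in range(n) if k - i >= 0))
    return v, gammas
```

For u_k = \epsilon_k + 0.5 \epsilon_{k-1}, so that sigma = [1.25, 0.5], the mean-square error falls from 1.25 toward 1 and \gamma_{0k} tends to 0.5. With these \alpha_i, equation (44) gives a_{0k} = \gamma_{0k} and b_{1k} = -\gamma_{0k}.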


6.3 Convergence of Filter Coefficients 


Since the time-varying filter output y_k converges to the optimal 
unconstrained predictor \hat{x}_{k+1}, one would expect that the 
time-varying coefficients a_{ik} and b_{ik} will converge to constant 
coefficients a_i and b_i. Since we have excluded processes with zeros 
on the unit circle, an nth-order recursive structure for the optimal 
predictor is known to exist.” But this is not sufficient. It is also 
necessary to exclude the possibility that the intrinsic order of the 
process is less than n. Then the coefficients of the nth-order 
recursive equation for the optimal predictor are unique and the 
time-varying coefficients a_{ik} and b_{ik} will in fact converge to 
these constant coefficients. 


6.4 Relation to Kalman Filtering 


In addition to proving convergence of our design approach, we have 
shown for the matched-order case that the time-varying filter generated 
by the design procedure is the optimal growing-memory predictor. 
At each instant, k, the 2n stored data samples contain all the needed 
information about the observed past of the process, x_k, x_{k-1}, ..., 
x_0. It follows that the time-varying filter must be identical to the 
Kalman predictor’ which is obtained by expressing the process model in 
state equation form. However, the Kalman development is computationally 
less efficient as may be seen by comparing the Riccati equations with 
the simpler recursions given in Section 6.2. 

In recent months recursions similar to ours have been published in 
various contexts. They appear in a paper by J. Rissanen and L. Barbosa 
(Ref. 9) as steps in the factorization of the covariance matrix of 
{u_k}, the nth-order moving average, and Kailath (Ref. 10) has 
indicated that such recursions follow from an innovations approach to 
prediction. Related formulas also appear in R. L. Kashyap's (Ref. 11) 
derivation of predictor characteristics in terms of the parameters 
\alpha_i and \beta_i of the process representation. In our derivation, 
as in Refs. 9 and 10, the basic data are the set of \alpha_i and the 
autocovariance function of {u_k}. In contrast, the new design algorithm 
presented in Section VII uses only the covariances of the process to be 
predicted, quantities that are often more accessible in practice than 
the process parameters. 


VII. SYNTHESIS TECHNIQUE 


In this section we apply the projecting-filter design approach of 
Section V to obtain a computational algorithm for the general case in 
which the order of the process may differ from the order of the filter. 
The basic idea of the approach is to compute successive sets of weight- 
ing coefficients for an nth-order time-varying recursive filter which 
asymptotically approaches the desired time-invariant projecting filter. 

As discussed in Section V, the time-varying projecting filter of 
interest is characterized by the input-output relationship 

    y_k = P\{x_{k+1} \mid \mathcal{H}_k\},    (46) 

where \mathcal{H}_k denotes the subspace spanned by the 2n variates 

    x'_k, x'_{k-1}, \ldots, x'_{k-n+1}, y_{k-1}, \ldots, y_{k-n}. 

Equation (46) uniquely defines y_k as the projection of x_{k+1} into 
\mathcal{H}_k. This projection can be expressed explicitly as a linear 
combination of the 2n variates; that is, 

    y_k = \sum_{i=0}^{n-1} a_{ik} x'_{k-i} + \sum_{i=1}^{n} b_{ik} y_{k-i}.    (47) 

Let d(\mathcal{H}_k) denote the dimension of the subspace 
\mathcal{H}_k, i.e., d(\mathcal{H}_k) is the minimum number of variates 
needed to span \mathcal{H}_k. If d(\mathcal{H}_k) = 2n then the 2n 
spanning variates are linearly independent and the coefficient set used 
in equation (47) is unique. On the other hand if d(\mathcal{H}_k) < 2n, 
the 2n spanning variates are linearly dependent and consequently there 
is an infinite number of possible choices for the coefficient set. This 
situation always occurs in the first 2n - 1 iterations 
(0 ≤ k < 2n - 1) and it may occur as well in subsequent iterations. To 
overcome this difficulty, we adopt a consistent procedure for selecting 
a linearly independent subset of the 2n spanning variates for each k. 
Variates are eliminated by setting appropriate coefficients to zero in 
equation (47). The remaining coefficients are then uniquely determined 
from the covariance matrix of the remaining variates and the 
cross-covariances between the remaining variates and x_{k+1}. 

The algorithm is initialized with y_0 = a_{00} x_0 and all of the other 
coefficients a_{i0} and b_{i0} (i ≠ 0) set to zero. Then each iteration 
consists of the following steps: (i) solving for the appropriate 
coefficient values, (ii) computing the needed covariances for the 
following iteration, (iii) determining an independent set of variates 
for the next prediction. 


7.1 Reduced Representation 


The procedure for eliminating dependent variates from the set of 
available data at time k leads to the following expression [equivalent 
to equation (47)] for the kth prediction 

    y_k = \sum_{i=0}^{p-1} a_{ik} x_{k-i} + \sum_{i=1}^{q} b_{ik} y_{k-i},    (48) 

with p ≤ n, q ≤ n.* The coefficients that do not appear in equation 
(48) are all set to zero in the process of eliminating dependent 
variates; that is, 

    a_{ik} = 0,    i = p, p+1, \ldots, n-1; 
    b_{ik} = 0,    i = q+1, q+2, \ldots, n. 

Note that x_{k-i} rather than x'_{k-i} appears in equation (48). This 
is so because x'_{k-i} = 0 for i > k so that any set containing this 
variate is necessarily dependent. Hence, in the initial n steps, 
p ≤ k + 1. Section 7.4 presents the general method by which a set of 
independent variates is determined. 


7.2 The Filter Equations 


With the prediction error defined as e_{k+1} = x_{k+1} - y_k, the 
projecting property implies the following orthogonality conditions 

    E e_{k+1} x_{k-i} = 0,    i = 0, 1, \ldots, p-1; 
    E e_{k+1} y_{k-i} = 0,    i = 1, 2, \ldots, q.    (49) 


* Note that p and q depend on k. They will be denoted p(k) and q(k) when 
ambiguity might otherwise arise. 
By substituting equation (48) for y_k into equation (49), we obtain the 
following set of d(\mathcal{H}_k) linear equations in the 
d(\mathcal{H}_k) coefficients: 

    r_{j+1} = \sum_{i=0}^{p-1} a_{ik} r_{i-j} + \sum_{i=1}^{q} b_{ik} w(k-j, k-i), 
        j = 0, 1, \ldots, p-1; 

    w(k+1, k-j) = \sum_{i=0}^{p-1} a_{ik} w(k-i, k-j) + \sum_{i=1}^{q} b_{ik} v(k-i, k-j), 
        j = 1, 2, \ldots, q;    (50) 

in which we have adopted the notation: 

    r_i = E x_k x_{k+i}, 
    w(k, j) = E x_k y_j, 
    v(k, j) = E y_k y_j. 

The function r_i comprises the given statistical information of the 
prediction problem and w and v must be expressed as functions of r_i 
and previously computed filter coefficients. 

Equations (50) have the following partitioned matrix form 

    \begin{bmatrix} T_k & X_k \\ X_k^T & V_k \end{bmatrix} 
    \begin{bmatrix} A_k \\ B_k \end{bmatrix} 
    = \begin{bmatrix} R_k \\ W_k \end{bmatrix},    (51) 

in which 

    T_k is the p × p autocovariance matrix of {x_k}, 
    X_k is the p × q cross-covariance matrix of {x_k} and {y_k}, 
    V_k is the q × q autocovariance matrix of {y_k}, 
    A_k = [a_{0k}, a_{1k}, \ldots, a_{p-1,k}]^T, 
    B_k = [b_{1k}, b_{2k}, \ldots, b_{qk}]^T, 
    R_k = [r_1, r_2, \ldots, r_p]^T, 
    W_k = [w(k+1, k-1), \ldots, w(k+1, k-q)]^T. 

Note that T_k and R_k depend only on the given autocovariance function 
r_i and on p, the number of forward coefficients to be computed. They 
are independent of previously computed coefficients. 

If we perform the multiplication indicated in equation (51) and 
then solve for A_k and B_k we derive 

    B_k = [V_k - X_k^T U_k X_k]^{-1} [W_k - X_k^T C_k], 
    A_k = C_k - U_k X_k B_k,    (52) 

where U_k = T_k^{-1} and C_k = U_k R_k, the column matrix of weights 
corresponding to the optimum pth-order nonrecursive predictor. By using 
efficient algorithms developed for the analysis of nonrecursive 
predictors,’”"** one may successively calculate U_0, U_1, ..., U_{n-1}, 
C_0, C_1, ..., C_{n-1} before the start of the synthesis procedure so 
that at the kth step, only a q × q matrix inversion [rather than one of 
order (p + q)] is required. We are assured that the matrix to be 
inverted is nonsingular because we have eliminated dependent variates 
by reducing the number of unknowns from 2n to p + q. Note that A_k 
consists of the coefficients of the optimum pth-order nonrecursive 
predictor modified by U_k X_k B_k, which indicates the effect of the 
feedback section of the predicting filter. 
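
The block elimination behind equation (52) can be checked numerically. The sketch below is a plain-Python illustration with hypothetical helper names; it solves linear systems with T_k instead of forming U_k explicitly, and it assumes q ≥ 1 and that both T_k and V_k - X_k^T U_k X_k are nonsingular.

```python
def solve(M, rhs):
    """Solve M z = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda i: abs(aug[i][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for i in range(n):
            if i != c:
                f = aug[i][c] / aug[c][c]
                aug[i] = [a - f * b for a, b in zip(aug[i], aug[c])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def transpose(M):
    return [list(col) for col in zip(*M)]


def solve_partitioned(T, X, V, R, W):
    """Return (A, B) as in eq. (52) for the partitioned system of eq. (51)."""
    p, q = len(T), len(V)
    Xt = transpose(X)                               # q rows of length p
    C = solve(T, R)                                 # C = U R
    UX = transpose([solve(T, col) for col in Xt])   # U X, built by columns
    S = [[V[i][j] - sum(Xt[i][m] * UX[m][j] for m in range(p))
          for j in range(q)] for i in range(q)]     # V - X^T U X
    rhs = [W[i] - sum(Xt[i][m] * C[m] for m in range(p)) for i in range(q)]
    B = solve(S, rhs)                               # only a q x q system
    A = [C[i] - sum(UX[i][j] * B[j] for j in range(q)) for i in range(p)]
    return A, B
```

Only the q × q system for B_k involves the covariances that change from step to step, which is the saving the text describes when the quantities that depend on T_k alone are prepared in advance.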


7.3 Obtaining Successive Covariance Statistics 

The nature of w(k, j) depends on which time index is the greater. 
If j ≥ k we observe that the projection property of the jth estimate 
implies that E x_k e_j = 0 for k = j - 1, j - 2, ..., j - n. Thus if we 
substitute x_{j+1} - e_{j+1} for y_j in the definition of w(k, j), we 
obtain 

    w(k, j) = E[(x_{j+1} - e_{j+1}) x_k] 
            = r_{j+1-k},    j = k, k+1, \ldots, k+n-1.    (53) 

For j < k, we substitute equation (26) for y_j in the definition of 
w(k, j), with the result 

    w(k, j) = \sum_{i=0}^{n-1} a_{ij} r_{j-k-i} + \sum_{i=1}^{n} b_{ij} w(k, j-i), 
        j = 0, 1, \ldots, k-1.    (54) 

Equation (54) indicates that {w(k, 0), w(k, 1), ..., w(k, k - 1)} is 
the sequence of filter outputs when {r_{-k}, r_{-k+1}, ..., r_{-1}} is 
the sequence of inputs. This is an example of the property of linear 
filters that the cross-covariance between input and output is the 
correlation of the filter impulse response with the input 
autocovariance function. Using the initial conditions w(k, j) = 0 for 
j < 0, we may iteratively apply equation (54) in order to compute the 
required values of w(k, j) for j < k. 
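
The recursion for the cross-covariances with j < k can be sketched as follows. The argument layout is an assumption made for illustration: a[j][i] holds a_{ij}, b[j][i-1] holds b_{ij}, and r is a function of an integer lag that must accept negative arguments.

```python
def cross_covariances(k, a, b, r):
    """Return [w(k, 0), ..., w(k, k-1)] with w(k, j) = E x_k y_j, via eq. (54)."""
    w = []
    for j in range(k):
        n = len(a[j])
        acc = 0.0
        for i in range(n):
            if j - i >= 0:            # x'_{j-i} is zero before time 0
                acc += a[j][i] * r(j - k - i)
        for i in range(1, n + 1):
            if j - i >= 0:            # w(k, j) = 0 for j < 0
                acc += b[j][i - 1] * w[j - i]
        w.append(acc)
    return w
```

With constant coefficients a_0 = 1, b_1 = 0.5 and r_m = 0.8^{|m|}, the call `cross_covariances(3, [[1.0]] * 3, [[0.5]] * 3, lambda m: 0.8 ** abs(m))` runs the filter over the inputs r_{-3}, r_{-2}, r_{-1}, as the text describes.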

The autocovariance coefficients of {y_k} may be determined from the 
orthogonality conditions. With k - n ≤ j ≤ k, we have E e_{k+1} y_j = 0 
so that 

    v(k, j) = E[(x_{k+1} - e_{k+1}) y_j] = w(k+1, j),    j = k-n, \ldots, k,    (55) 

and of course v(j, k) = v(k, j). Thus, equations (53), (54), and (55) 
express, in terms of known quantities, the covariance coefficients that 
appear in equation (51). 


7.4 The Number of Independent Variates 


In Section 7.2 we have assumed that p and q, the number of forward 
coefficients and the number of feedback coefficients to be computed 
at time k, are determined in a manner that assures the linear 
independence of the p + q variates that appear in equation (48) and 
therefore, the existence of the inverse matrix of equation (52). In 
many instances p = q = n so that all of the data in the predictor 
memory are linearly independent. On the other hand, there are two 
conditions under which the data are dependent. The first is called an 
initialization condition and this arises in the course of every 
synthesis procedure because the predictor begins to operate at k = 0 
with zero in all memory elements except one. The initialization 
condition obtains for the first 2n - 1 iterations of the design 
procedure during which d(\mathcal{H}_k) ≤ k + 1 < 2n because 
\mathcal{H}_k ⊂ \mathcal{R}_k and d(\mathcal{R}_k) = k + 1. The other 
condition under which d(\mathcal{H}_k) < 2n is called a reduced-order 
condition, which arises when certain of the final feedback coefficients 
and/or final forward coefficients are zero. A reduced-order condition 
arises for all processes of order less than n. 


7.4.1 Initialization 


In this section we assume that no reduced-order condition arises 
during the first 2n - 1 steps of the predictor synthesis. This implies 
that d(\mathcal{H}_k) = k + 1 so that p + q, the number of coefficients 
determined by orthogonality conditions, increases by one at each 
iteration. At k = 0, the predictor estimates x_1 given x_0, which 
implies p = 1, q = 0. For increasing k, we alternately increase q and p 
by one so that for 0 ≤ k ≤ 2n - 2 

    p = 1 + k/2,    q = k/2,    k even; 
    p = (k + 1)/2 = q,          k odd;    (56) 

when no reduced-order condition arises. Table I shows the variates 
that appear in equation (48) during the initial design stages of a 
second-order predictor. 
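
The start-up schedule of equation (56) is easy to tabulate. The sketch below assumes that no reduced-order condition arises, so that p = q = n once k exceeds 2n - 2; the function name is illustrative.

```python
def initial_counts(k, n):
    """Numbers (p, q) of forward and feedback coefficients at time k, eq. (56)."""
    if k > 2 * n - 2:
        return n, n                   # full memory: all 2n variates are used
    if k % 2 == 0:
        return 1 + k // 2, k // 2     # k even
    return (k + 1) // 2, (k + 1) // 2   # k odd
```

For a second-order predictor the schedule is (1, 0), (1, 1), (2, 1), (2, 2) at k = 0, 1, 2, 3, which matches the number of x and y variates listed in Table I.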


7.4.2 Reduced-Order Condition 


At time k + 1, the dependency of the data in storage can be deduced 
by observation of the coefficients computed at time k. In this 
section we show how the values of certain coefficients, in particular 
whether or not they are zero, determine the relationship between 
d(\mathcal{H}_k) and d(\mathcal{H}_{k+1}), the numbers of linearly 
independent variates in storage at time k and at time k + 1. In the 
next section we present the algorithm for determining the number of 
forward coefficients and the number of feedback coefficients to be 
computed at each step of the design. 

TABLE I—STEPS IN PREDICTOR DESIGN 

Time   Predicted Variate   Independent Data                  Projection 
0      x_1                 x_0                               y_0 
1      x_2                 x_1, y_0                          y_1 
2      x_3                 x_2, x_1, y_1                     y_2 
3      x_4                 x_3, x_2, y_2, y_1                y_3 
k      x_{k+1}             x_k, x_{k-1}, y_{k-1}, y_{k-2}    y_k 

The following theorem states that there is a dependence among the 
variates in storage at time k + 1 if and only if the coefficients 
determined at time k correspond to a filter of order less than n. 

Theorem: With d(\mathcal{H}_k) = 2n, d(\mathcal{H}_{k+1}) = 2n - 1 if 
and only if a_{n-1,k} = b_{nk} = 0. Otherwise d(\mathcal{H}_{k+1}) = 2n. 

Proof: Assume a_{n-1,k} = b_{nk} = 0. Then 

    y_k = \sum_{i=0}^{n-2} a_{ik} x_{k-i} + \sum_{i=1}^{n-1} b_{ik} y_{k-i}, 

which shows the linear dependency of the following variates in storage 
at time k + 1: x_k, x_{k-1}, ..., x_{k-n+2}, y_k, ..., y_{k-n+1}. Thus 
d(\mathcal{H}_{k+1}) < 2n. On the other hand, the 2n - 1 variates 
x_{k+1}, x_k, ..., x_{k-n+2}, y_{k-1}, ..., y_{k-n+1} are linearly 
independent. All except x_{k+1} are independent because they are in 
storage at time k and d(\mathcal{H}_k) = 2n. In addition, the 
assumption that {x_k} is nondeterministic implies that x_{k+1} cannot 
be expressed as a linear combination of the other stored variates 
because each of these is in \mathcal{R}_k. It follows that 
d(\mathcal{H}_{k+1}) = 2n - 1. 

To prove the converse, assume d(\mathcal{H}_{k+1}) = 2n - 1. It follows 
that there exists a linearly dependent set of stored data. By the 
reasoning given above this set does not include x_{k+1} because all of 
the other stored variates are in \mathcal{R}_k. However the set does 
include y_k because all of the other variates are independent. Hence 
y_k can be represented as a linear combination of x_k, x_{k-1}, ..., 
x_{k-n+2}, y_{k-1}, ..., y_{k-n+1}. But the data in storage at time k 
also includes x_{k-n+1} and y_{k-n} and the fact that 
d(\mathcal{H}_k) = 2n implies that the representation of y_k is unique. 
Therefore the coefficients of x_{k-n+1} and y_{k-n} satisfy 
a_{n-1,k} = b_{nk} = 0. Q.E.D. 

By reasoning similar to that used to prove this theorem we may 
establish the dimensionality of the data in storage at time k + 1 when 
d(5C,) < 2n. Thus we have the following corollaries which apply for 
all & including the initial steps of the predictor design. 


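Corollary 1: With $d(\mathcal{H}_k) = p + q$ and $p = q < n$, $d(\mathcal{H}_{k+1}) = p + q - 1$
if and only if $a_{p-1,k} = b_{qk} = 0$. Otherwise $d(\mathcal{H}_{k+1}) = p + q + 1$.

Corollary 2: With $d(\mathcal{H}_k) = p + q$ and $n \ge p = q + 1$, $d(\mathcal{H}_{k+1}) = p + q$
if and only if $a_{p-1,k} = 0$. Otherwise $d(\mathcal{H}_{k+1}) = p + q + 1$.

Corollary 3: With $d(\mathcal{H}_k) = p + q$ and $p = q - 1 < n$, $d(\mathcal{H}_{k+1}) = p + q$
if and only if $b_{qk} = 0$. Otherwise $d(\mathcal{H}_{k+1}) = p + q + 1$.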


7.4.3 The Number of Computed Coefficients 


On the basis of the theorem and corollaries of Section 7.4.2, we
establish the procedure shown in Table II for determining the numbers
of forward and feedback coefficients $p(k + 1)$ and $q(k + 1)$ to be
computed at time $k + 1$. The table indicates that $p(k + 1)$ and $q(k + 1)$
may be determined from $p = p(k)$ and $q = q(k)$ (shown in the left
column) and from the final two feedback coefficients and the final


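TABLE II - THE NUMBER OF COEFFICIENTS COMPUTED

Columns: number of coefficients computed at time k (p, q); final
coefficients computed at time k; number of coefficients computed at
time k + 1.

Line | p, q      | $b_{qk}$ | $b_{q-1,k}$ | $a_{p-1,k}$ | $a_{p-2,k}$ | p(k+1) | q(k+1)
-----|-----------|----------|-------------|-------------|-------------|--------|-------
1    | p = q = n | ≠0       |             |             |             | n      | n
2    | p = q = n |          |             | ≠0          |             | n      | n
3    | p = q     | 0        | ≠0          | 0           |             | p      | q-1
4    | p = q     | 0        | 0           | 0           | ≠0          | p-1    | q
5    | p = q < n | ≠0       |             |             |             | p+1    | q
6    | p = q < n |          |             | ≠0          |             | p      | q+1
7    | p > q     |          |             | ≠0          |             | p      | q+1
8    | p > q     | ≠0       |             | 0           |             | p      | q
9    | p > q     | 0        |             | 0           | ≠0          | p-1    | q+1
10   | p < q     | ≠0       |             |             |             | p+1    | q
11   | p < q     | 0        |             | ≠0          |             | p      | q
12   | p < q     | 0        | ≠0          | 0           |             | p+1    | q-1
13   | any p, q  | 0        | 0           | 0           | 0           | irregular |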




two forward coefficients (shown in the central four columns) computed 
at time k. If there is no entry for one of the coefficients, the indicated 
relationship between p(k + 1), q(k + 1) and p, q is independent of 
that coefficient. The other symbols indicate that a coefficient must 
necessarily be zero or nonzero for a relationship to be valid. 
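
The lookup that Table II describes can be sketched as a small function. This is a minimal sketch under one reading of the table, in which each line's conditions are taken from the theorem and corollaries of Section 7.4.2 and lines are tested in the order shown in the comments. The function name `next_counts`, its argument names, and the choice to raise an error for any combination this reading does not cover are assumptions made for illustration.

```python
def next_counts(p, q, n, b_q, b_q1, a_p1, a_p2):
    """Return (p(k+1), q(k+1)) from p = p(k), q = q(k) and the final coefficients.

    b_q, b_q1 stand for b_{q,k}, b_{q-1,k}; a_p1, a_p2 for a_{p-1,k}, a_{p-2,k}.
    Returns None for the irregular case (line 13: all four coefficients zero).
    """
    if b_q == 0 and b_q1 == 0 and a_p1 == 0 and a_p2 == 0:
        return None                                   # line 13
    if p == q:
        if b_q != 0:
            return (n, n) if p == n else (p + 1, q)   # lines 1 and 5
        if a_p1 != 0:
            return (n, n) if p == n else (p, q + 1)   # lines 2 and 6
        if b_q1 != 0:
            return (p, q - 1)                         # line 3
        return (p - 1, q)                             # line 4 (a_p2 nonzero here)
    if p > q:
        if a_p1 != 0:
            return (p, q + 1)                         # line 7
        if b_q != 0:
            return (p, q)                             # line 8
        if a_p2 != 0:
            return (p - 1, q + 1)                     # line 9
    else:
        if b_q != 0:
            return (p + 1, q)                         # line 10
        if a_p1 != 0:
            return (p, q)                             # line 11
        if b_q1 != 0:
            return (p + 1, q - 1)                     # line 12
    raise ValueError("combination not covered by this reading of Table II")
```

For example, `next_counts(3, 2, 3, 0.2, 0.0, 0.0, 0.1)` falls under line 8 and leaves the counts at (3, 2).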


If, at time $k$, $p + q = d(\mathcal{H}_k)$, the variates
$x_k, x_{k-1}, \ldots, x_{k-p+1}, y_{k-1}, \ldots, y_{k-q}$ are independent. This
condition and the theorem and corollaries imply that the set
$\{x_{k+1}, x_k, \ldots, x_{k-p(k+1)+2}, y_k, \ldots, y_{k-q(k+1)+1}\}$ is
independent and spans $\mathcal{H}_{k+1}$. Thus lines 1 and 2 of
Table II follow from the theorem; lines 3 through 6, from the theorem
and Corollary 1; lines 7 through 9, from Corollary 2; and lines 10
through 12, from Corollary 3.
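
The index ranges of the independent set are easy to get wrong by one, so a short enumeration may help. This is a minimal sketch: the function name `independent_set` and the representation of a variate as a (symbol, time index) pair are assumptions made for illustration.

```python
def independent_set(k, p_next, q_next):
    """List the variates x_{k+1}, ..., x_{k-p'+2}, y_k, ..., y_{k-q'+1}.

    p_next and q_next stand for p(k+1) and q(k+1). Each variate is
    returned as a (symbol, time index) pair.
    """
    xs = [("x", k + 1 - i) for i in range(p_next)]  # newest input first
    ys = [("y", k - i) for i in range(q_next)]      # newest output first
    return xs + ys
```

With k = 5, p(k+1) = 2 and q(k+1) = 1, the set is x_6, x_5, y_5.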

The table accounts for all possible combinations of computed coef- 
ficient values except those in which the last two forward coefficients 
and the last two feedback coefficients are all zero. This situation arises 
during the initial design stages whenever the input process is partially 
decorrelated. The manner in which independent variates are chosen 
for such a process is described in Section 7.4.5. When the irregularity 
arises in the design of predictors for other processes, there is no
independent basis of $\mathcal{H}_{k+1}$ that is the union of consecutive members of
$\{x_k\}$ beginning with $x_{k+1}$ and consecutive members of $\{y_k\}$ beginning
with $y_k$. Thus it is impossible to represent $P\{y_k \mid \mathcal{H}_k\}$ in the concise
form of equation (48). Nor is it possible in general to determine at all 
times subsequent to k an independent set of stored data solely by 
considering $p$, $q$ and the previously computed coefficients. All this
serves to complicate quite substantially the representation of $y_k$, the
equations which determine the coefficients, and the algorithm for
determining the numbers of coefficients to be computed after the
occurrence of the irregular condition indicated on the last line of Table II.

Rather than add substantially to the size of this paper by presenting 
a general technique for treating this situation, we simply note that 
except for partially decorrelated processes, it has never arisen in our 
experience of designing projecting filters and that in fact it appears to 
represent a pathological case. We have not discovered an example of a 
process for which four projection coefficients are simultaneously zero 
after one or more of their counterparts is nonzero at the previous 
time instant. 


7.4.4 Low Order Processes 


When $\{x_k\}$ is the response of an mth-order filter to white noise and
m is no greater than n, the order of the predictor, the synthesis method
leads to the mth-order form of the optimum unconstrained predictor.
Section 6.1 contains a proof of this statement for m = n and in this 
section we show that if m < n, a reduced-order situation arises and the 
effective order of the predictor does not grow beyond m.

Let $a_{ik}$ and $b_{ik}$ be the coefficients of the optimal growing-memory
mth-order predictor, determined in the manner indicated in Section
6.2. Thus

$$x^*_k = \sum_{i=0}^{m-1} a_{ik}\, x_{k-i} + \sum_{i=1}^{m} b_{ik}\, x^*_{k-i} \qquad (57)$$


Note that for all $k < 2m - 1$, $y_k$, the output of the nth-order predictor,
is identical to $x^*_k$ because the design proceeds as for a predictor of
order m.

Equation (56) indicates that at step 2m the initialization procedure
leads to $p = m + 1$, $q = m$ and

$$y_{2m} = \sum_{i=0}^{m} a'_{i,2m}\, x_{2m-i} + \sum_{i=1}^{m} b'_{i,2m}\, y_{2m-i} \qquad (58)$$


where $a'_{i,2m}$ and $b'_{i,2m}$ are determined uniquely by the orthogonality
conditions. Hence it follows from the optimality of equation (57)
that $a'_{m,2m} = 0$ and that the other coefficients are equal to the ones in
equation (57) with $k = 2m$. Line 8 of Table II indicates that
$p(2m + 1) = m + 1$ and $q(2m + 1) = m$, and once again we have
$a'_{m,2m+1} = 0$ and the other coefficients equal to those in equation (57)
for the optimal mth-order predictor. It is clear that for all $k \ge 2m$ this
sequence is repeated with $p(k) = m + 1$, $q(k) = m$ and $a'_{mk} = 0$. Hence
the algorithm converges to the unique mth-order form of the unconstrained
optimal predictor.
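
As a worked instance of the argument above, consider m = 1 with n > 1. The value m = 1 is our choice, made purely for illustration. The initialization gives p = 2 and q = 1 at step 2m = 2, so equation (58) and its reduction read as follows.

```latex
% Equation (58) specialized to m = 1 (p = 2, q = 1 at step 2m = 2):
y_2 = a'_{0,2}\, x_2 + a'_{1,2}\, x_1 + b'_{1,2}\, y_1 .
% The optimality argument gives a'_{1,2} = 0, leaving the first-order form
y_2 = a'_{0,2}\, x_2 + b'_{1,2}\, y_1 ,
% where a'_{0,2} and b'_{1,2} equal the k = 2 coefficients of equation (57).
```

Only one forward and one feedback coefficient remain nonzero, so the predictor keeps its first-order form even though two forward coefficients are computed at each step.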


7.4.5 Partially Decorrelated Input Process 


A partially decorrelated process is a nonwhite process for which
every set of $j + 1$ ($j > 0$) adjacent samples is uncorrelated. In other
words, $\{x_k\}$ is partially decorrelated if for some $j > 0$,
$r_1 = r_2 = \cdots = r_j = 0$ and $r_{j+1} \neq 0$. For example, the error
process of an nth-order projecting filter is partially decorrelated with $j = n$.
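
The definition can be turned into a direct test on an autocorrelation sequence. This is a minimal sketch, assuming that $r_i$ denotes the autocorrelation of the input at lag i and that the sequence is supplied as a list whose first entry is the lag-0 value. The function name `decorrelation_order` and the tolerance parameter `tol` are illustrative assumptions.

```python
def decorrelation_order(r, tol=0.0):
    """Return j such that r[1] = ... = r[j] = 0 and r[j+1] != 0.

    r[i] is the autocorrelation at lag i, with r[0] the lag-0 value.
    A result of 0 means the lag-1 value is already nonzero, so the process
    is not partially decorrelated. None means every supplied lag beyond 0
    is zero, so the data cannot distinguish the process from white noise.
    """
    for lag in range(1, len(r)):
        if abs(r[lag]) > tol:
            return lag - 1
    return None
```

A process is partially decorrelated in the sense above when the returned value is a positive integer.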

Note that with a partially decorrelated input, the initial $j$ generating
filter outputs (corresponding to optimal nonrecursive predictions)
are zero. Thus


$$y_k = x^*_k = 0 = a_{ik} = b_{ik} \quad \text{for } 0 \le k < j \text{ and all } i. \qquad (59)$$


This is a reduced-order situation conforming to line 13 of Table II
(if we assume $b_{0k} = 0$ and $a_{ik} = b_{ik} = 0$ for $i < 0$). For this
irregular case we adopt the following initialization procedure as an
alternative to equation (56).


(i) All coefficients are 0 for $k < j$.
(ii) $p(j) = j + 1$, $q(j) = 0$.
(iii) $p(k)$, $q(k)$ according to Table II for $k > j$.
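
The first two steps of this initialization can be sketched in a few lines. This is a minimal sketch: the function name `initial_counts` is an assumption, and reporting step (i) as zero counts (0, 0) is our own way of encoding "all coefficients are 0". Step (iii) is not implemented here because it defers to Table II.

```python
def initial_counts(k, j):
    """Return (p(k), q(k)) for k <= j with a partially decorrelated input.

    j is the number of leading autocorrelation lags that are zero.
    """
    if k < 0 or j <= 0:
        raise ValueError("expected k >= 0 and j > 0")
    if k < j:
        return (0, 0)       # step (i): nothing to compute, all coefficients 0
    if k == j:
        return (j + 1, 0)   # step (ii): j + 1 forward coefficients, no feedback
    raise ValueError("for k > j the counts follow from Table II")
```

With j = 2, the counts are (0, 0) at k = 0 and k = 1, and (3, 0) at k = 2.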


VIII. CONCLUSIONS 


This paper introduces the projecting-filter principle of constrained- 
order recursive prediction and presents one technique of projecting 
filter synthesis. This technique has led to the design of the predictors 
described in Section 1.3 and to several other successful designs for a 
variety of random processes. However, the class of processes for which 
the technique is valid (that is, for which the algorithm converges to 
a time-invariant filter) and indeed the class for which a projecting 
filter of a given order exists have not as yet been determined. These 
questions are the subject of current research. Another important area 
of investigation involves the numerical aspect of the synthesis—the 
study of the sensitivity of this or any other design method to round- 
off in the calculation of coefficients. 

Our studies to date indicate that the projecting filter is valuable in 
that it predicts many processes more accurately than other known 
devices of equal complexity. Our results are readily extended to vector- 
valued processes. Finally, we note that the projecting filter principle 
is applicable to a large class of estimation problems of which prediction 
one unit of time in the future is but a single example. 